Pandas create new column with count from groupby
That's not a new column, that's a new DataFrame:
In [11]: df.groupby(["item", "color"]).count()
Out[11]:
             id
item  color
car   black   2
truck blue    1
      red     2
To get the result you want, use reset_index:
In [12]: df.groupby(["item", "color"])["id"].count().reset_index(name="count")
Out[12]:
    item  color  count
0    car  black      2
1  truck   blue      1
2  truck    red      2
To get a "new column" you could use transform:
In [13]: df.groupby(["item", "color"])["id"].transform("count")
Out[13]:
0 2
1 2
2 2
3 1
4 2
dtype: int64
I recommend reading the split-apply-combine section of the docs.
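The two approaches are easier to compare side by side. The sketch below is only an illustration: the question's original data is not shown here, so the rows of df are an assumption chosen to reproduce the outputs above, and summary is an illustrative name.

```python
import pandas as pd

# Hypothetical input, reconstructed so that it matches the outputs shown above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "item": ["car", "truck", "car", "truck", "truck"],
    "color": ["black", "red", "black", "blue", "red"],
})

# Aggregation: one row per (item, color) group, so the result is a new, shorter DataFrame.
summary = df.groupby(["item", "color"])["id"].count().reset_index(name="count")

# Transform: one value per original row, so the result can be assigned as a column of df.
df["count"] = df.groupby(["item", "color"])["id"].transform("count")

print(summary)
print(df)
```

The aggregated form suits reporting; the transform form suits filtering or sorting the original rows by group size.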
Adding a 'count' column to the result of a groupby in pandas?
You can use size:
df.groupby(['A','B']).size()
Out[590]:
A  B
x  p    2
y  q    1
z  r    2
dtype: int64
For your solution, select one of the columns before counting:
df.groupby(['A','B']).B.agg('count')
Out[591]:
A  B
x  p    2
y  q    1
z  r    2
Name: B, dtype: int64
Update, to return a DataFrame with a named count column:
df.groupby(['A','B']).B.agg('count').to_frame('c').reset_index()
#df.groupby(['A','B']).size().to_frame('c').reset_index()
Out[593]:
   A  B  c
0  x  p  2
1  y  q  1
2  z  r  2
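size and count agree in the example above, but they can differ when the counted column contains missing values: size counts rows, while count skips NaN. A minimal sketch, assuming a hypothetical extra column C with one missing value (the data and the names by_size and by_count are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column C has one missing value in the ("x", "p") group.
df = pd.DataFrame({
    "A": ["x", "x", "y", "z", "z"],
    "B": ["p", "p", "q", "r", "r"],
    "C": [1.0, np.nan, 3.0, 4.0, 5.0],
})

# size() counts every row in each group, including rows where C is NaN.
by_size = df.groupby(["A", "B"]).size().reset_index(name="c")

# count() on a selected column counts only the non-missing values of that column.
by_count = df.groupby(["A", "B"])["C"].count().reset_index(name="c")

print(by_size)
print(by_count)
```

If the goal is "number of rows per group", size is usually the safer choice; count answers "number of non-missing values of this column per group".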
Pandas, group by count and add count to original dataframe?
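If I understand correctly, you want each group's size broadcast back to every row. Selecting a single column before transform guarantees a Series, even if the frame has more than one non-key column:
In [247]: df['count'] = df.groupby('kind')['kind'].transform('count')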
In [248]: df
Out[248]:
    kind          msg  count
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1
Sorting by the new column:
In [249]: df.sort_values('count', ascending=False)
Out[249]:
    kind          msg  count
3     bb    bb text 1      4
4     bb    bb text 2      4
5     bb    bb text 3      4
6     bb    bb text 4      4
0    aaa   aaa text 1      3
1    aaa   aaa text 2      3
2    aaa   aaa text 3      3
7   cccc  cccc text 1      2
8   cccc  cccc text 2      2
9     dd    dd text 1      1
10     e     e text 1      1
11   fff   fff text 1      1
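When the frame has more than one non-key column, calling transform on the whole groupby returns a DataFrame rather than a single column. The sketch below illustrates this with hypothetical data (the user column and the names per_column and ranked are assumptions for illustration) and uses a stable sort so rows keep their original order within ties:

```python
import pandas as pd

# Hypothetical data with an extra "user" column, unlike the two-column frame above.
df = pd.DataFrame({
    "kind": ["aaa", "bb", "aaa", "bb", "bb", "e"],
    "msg": ["a1", "b1", "a2", "b2", "b3", "e1"],
    "user": ["u1", "u2", "u3", "u4", "u5", "u6"],
})

# Without a column selection, transform returns one column per non-key column.
per_column = df.groupby("kind").transform("count")

# Selecting a single column yields a Series aligned with df's index.
df["count"] = df.groupby("kind")["kind"].transform("count")

# mergesort is a stable algorithm, so ties keep their original relative order.
ranked = df.sort_values("count", ascending=False, kind="mergesort")

print(per_column.columns.tolist())
print(ranked)
```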
Pandas GroupBy and add count of unique values as a new column
Use transform to broadcast the per-group result back to every row:
df['timestamp_count'] = (
    df.groupby(["source", "day"])['timestamp'].transform('nunique'))
df
   day    source               timestamp  timestamp_count
0    1  facebook 2018-08-04 11:16:32.416                2
1    1  facebook 2019-01-03 10:25:38.216                2
2    1   twitter 2018-10-14 13:26:22.123                1
3    2  facebook 2019-01-30 12:16:32.416                1
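For a self-contained version, the frame below is rebuilt from the output shown above; per_group is an illustrative name added to contrast the aggregated result with the broadcast one. Treat it as a sketch rather than the asker's exact setup.

```python
import pandas as pd

# Input rebuilt from the printed output above.
df = pd.DataFrame({
    "day": [1, 1, 1, 2],
    "source": ["facebook", "facebook", "twitter", "facebook"],
    "timestamp": pd.to_datetime([
        "2018-08-04 11:16:32.416",
        "2019-01-03 10:25:38.216",
        "2018-10-14 13:26:22.123",
        "2019-01-30 12:16:32.416",
    ]),
})

# Aggregated form: one value per (source, day) group, indexed by the group keys.
per_group = df.groupby(["source", "day"])["timestamp"].nunique()

# Broadcast form: the same per-group value repeated on every row of its group.
df["timestamp_count"] = (
    df.groupby(["source", "day"])["timestamp"].transform("nunique"))

print(per_group)
print(df)
```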
Python pandas - After groupby, how to create new columns based on values in other columns
You can combine groupby.agg and pivot_table:
(df
 .groupby(['col1', 'col2'])
 .agg(**{'New_col3': ('col3', lambda x: '/'.join(sorted(x)))})
 .join(df.pivot_table(index=['col1', 'col2'],
                      columns='col3',
                      values='col4',
                      fill_value=0)
         .add_suffix('_count')
       )
 .reset_index()
)
Output:
  col1 col2 New_col3  L_count  W_count
0    A    D      L/W        2        1
1    A    T      L/W        3        2
2    B    D      L/W        2        3
3    C    T        W        0        2
Used input:
df = pd.DataFrame({'col1': list('AAAABBC'),
                   'col2': list('DDTTDDT'),
                   'col3': list('LWLWWLW'),
                   'col4': (2,1,3,2,3,2,2)})
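One detail worth knowing: pivot_table aggregates with the mean by default, which happens to match here because each (col1, col2, col3) combination occurs once. The sketch below splits the chain into named steps and passes aggfunc='sum' explicitly; that choice, and the names labels, wide and result, are assumptions for illustration rather than part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'col1': list('AAAABBC'),
                   'col2': list('DDTTDDT'),
                   'col3': list('LWLWWLW'),
                   'col4': (2, 1, 3, 2, 3, 2, 2)})

# Step 1: per (col1, col2) group, join the sorted col3 labels into one string.
labels = (df.groupby(['col1', 'col2'])
            .agg(New_col3=('col3', lambda x: '/'.join(sorted(x)))))

# Step 2: spread col4 into one column per col3 value.
# aggfunc='sum' is an explicit choice here; the default would be the mean.
wide = (df.pivot_table(index=['col1', 'col2'],
                       columns='col3',
                       values='col4',
                       aggfunc='sum',
                       fill_value=0)
          .add_suffix('_count'))

# Step 3: both pieces share the (col1, col2) index, so join aligns them.
result = labels.join(wide).reset_index()

print(result)
```

If duplicate (col1, col2, col3) combinations were present, the sum and the default mean would give different numbers, so it is worth choosing the aggregation deliberately.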