Pandas Groupby and Sum Only One Column
The only way to do this would be to include C in your groupby (the groupby function can accept a list).
Give this a try:
df.groupby(['A','C'])['B'].sum()
One other thing to note: if you need to keep working with the result after the aggregation, you can pass as_index=False so that the group keys come back as regular columns of a DataFrame instead of as the index. This one gave me problems when I was first working with Pandas. Example:
df.groupby(['A','C'], as_index=False)['B'].sum()
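A minimal sketch of the difference between the two calls, assuming a small made-up frame (the data and the extra column D are hypothetical; only the column names A, B and C come from the answer above):

```python
import pandas as pd

# Hypothetical sample data; only the column names A, B, C follow the answer.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "C": ["p", "p", "q", "r"],
    "B": [1, 2, 3, 4],
    "D": [10, 20, 30, 40],  # extra column that is deliberately not summed
})

# Default: the group keys become a MultiIndex and the result is a Series.
as_series = df.groupby(["A", "C"])["B"].sum()

# as_index=False: the group keys stay as ordinary columns of a DataFrame.
as_frame = df.groupby(["A", "C"], as_index=False)["B"].sum()
print(as_frame)
```

The second form is usually easier to merge or filter afterwards because A and C can be addressed like any other column.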
Group By - but sum one column, and show original columns
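# Keep the columns that should be shown unchanged
df_save = df_orig.loc[:, ["A", "C", "E"]]
# Sum B and D per group of A
df_agg = df_orig.groupby("A").agg({"B": "sum", "D": "sum"}).reset_index()
# Attach the totals to every original row (joins on the shared column A)
df_merged = df_save.merge(df_agg)
# Blank repeated values so each total is shown once
# (note: this also blanks equal totals that belong to different groups)
for c in ["B", "D"]:
    df_merged.loc[df_merged[c].duplicated(), c] = ''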
A | C | E | B | D |
---|---|---|---|---|
Apple | Green | X | 10 | 1 |
Pear | Brown | Y | 155 | 23 |
Pear | Yellow | Z |  |  |
Banana | Yellow | P | 4 | 4 |
Plum | Red | R | 2 | 5 |
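As a variation on the merge approach above, the same table can probably be produced with transform, which avoids the merge and blanks repeats per group key rather than per value. This is a sketch under assumptions: the input rows below are hypothetical (only the group totals match the table), and df_out is an illustrative name.

```python
import pandas as pd

# Hypothetical input; only the per-group totals match the table above.
df_orig = pd.DataFrame({
    "A": ["Apple", "Pear", "Pear", "Banana", "Plum"],
    "C": ["Green", "Brown", "Yellow", "Yellow", "Red"],
    "E": ["X", "Y", "Z", "P", "R"],
    "B": [10, 100, 55, 4, 2],
    "D": [1, 20, 3, 4, 5],
})

# Broadcast each group's sum back onto every row of that group.
totals = df_orig.groupby("A")[["B", "D"]].transform("sum")

# True for every row whose key A has already appeared earlier.
repeat_row = df_orig["A"].duplicated()

df_out = df_orig[["A", "C", "E"]].copy()
for col in ["B", "D"]:
    # astype(object) lets numbers and empty strings share one column.
    df_out[col] = totals[col].astype(object).mask(repeat_row, "")
print(df_out)
```

Because the mask is built from the key column A, two different groups that happen to have the same total both keep their value.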
Pandas groupby() on one column and then sum on another
Use groupby, aggregate with sum, and reshape with unstack:
df = df.groupby(['Name','Year'])['Goals_scored'].sum().unstack()
print (df)
Year 2014 2015 2016
Name
John Smith 5 5 1
Alternatively, use pivot_table:
df = df.pivot_table(index='Name',columns='Year', values='Goals_scored', aggfunc='sum')
print (df)
Year 2014 2015 2016
Name
John Smith 5 5 1
Finally, to turn the index back into a column:
df = df.reset_index().rename_axis(None, axis=1)
print (df)
Name 2014 2015 2016
0 John Smith 5 5 1
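Put together, the steps above might look like the following sketch. The input rows are hypothetical (chosen so the yearly totals match the printed output), and the names wide, pivoted and flat are illustrative.

```python
import pandas as pd

# Hypothetical input rows; the yearly totals match the output shown above.
df = pd.DataFrame({
    "Name": ["John Smith"] * 4,
    "Year": [2014, 2014, 2015, 2016],
    "Goals_scored": [3, 2, 5, 1],
})

# Route 1: group on both keys, sum, then move Year into the columns.
wide = df.groupby(["Name", "Year"])["Goals_scored"].sum().unstack()

# Route 2: pivot_table does the grouping and reshaping in one call.
pivoted = df.pivot_table(index="Name", columns="Year",
                         values="Goals_scored", aggfunc="sum")

# Turn Name back into a column and drop the "Year" label on the columns axis.
flat = wide.reset_index().rename_axis(None, axis=1)
print(flat)
```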
Pandas - dataframe groupby - how to get sum of multiple columns
By using apply (note the double brackets, which select a list of columns):
df.groupby(['col1', 'col2'])[["col3", "col4"]].apply(lambda x: x.astype(int).sum())
Out[1257]:
col3 col4
col1 col2
a c 2 4
d 1 2
b d 1 2
e 2 4
If you would rather use agg:
df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})
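A runnable sketch of both variants, assuming the values are stored as strings (which is what the astype(int) in the answer suggests). The data is hypothetical and chosen to reproduce the sums shown above; here the conversion is done once up front instead of inside apply.

```python
import pandas as pd

# Hypothetical data; col3/col4 are strings, as the astype(int) call implies.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": ["1"] * 6,
    "col4": ["2"] * 6,
})

# Convert once, so the plain groupby sum adds numbers instead of joining text.
df[["col3", "col4"]] = df[["col3", "col4"]].astype(int)

# Select the columns with a list (double brackets) and sum them per group.
result = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()

# The agg form gives the same table and allows a different function per column.
same = df.groupby(["col1", "col2"]).agg({"col3": "sum", "col4": "sum"})
print(result)
```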
Pandas GroupBy Total Row for Days of the Week, then sum only on one column
Replacing NaN values with "" should solve your query. After you add the Total row to the dfTemp DataFrame, add this line of code:
dfTemp.fillna(value="", inplace=True)
If you want to avoid calculating the sum on categorical variables directly, build the Total row explicitly:
dfTotal = pd.DataFrame({"inc_Day_of_Week": "", "inc_volume": dfTemp.inc_volume.sum()}, index=["Total"])
dfTemp = pd.concat([dfTemp, dfTotal])
OUTPUT
inc_Day_of_Week inc_volume
0 Monday 3.0
1 Thursday 1.0
2 Tuesday 1.0
3 Wednesday 2.0
Total 7.0
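A self-contained sketch of the explicit Total row, assuming dfTemp already holds the per-day counts (the frame below is rebuilt from the output above for illustration):

```python
import pandas as pd

# Per-day counts, rebuilt from the output shown above for illustration.
dfTemp = pd.DataFrame({
    "inc_Day_of_Week": ["Monday", "Thursday", "Tuesday", "Wednesday"],
    "inc_volume": [3.0, 1.0, 1.0, 2.0],
})

# Build the Total row by hand: blank label, summed numeric column only.
dfTotal = pd.DataFrame(
    {"inc_Day_of_Week": "", "inc_volume": dfTemp["inc_volume"].sum()},
    index=["Total"],
)

# Append the row; the original integer index is kept alongside "Total".
dfTemp = pd.concat([dfTemp, dfTotal])
print(dfTemp)
```

Since the text column is filled with an empty string from the start, no NaN appears and no fillna call is needed in this variant.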
Pandas dataframe, how can I group by single column and apply sum to multiple column and add new sum column?
You can set Date as the index, sum the selected columns along axis=1, then group by level=0 and transform with sum so that every row receives its group's total:
df['Total'] = (df.set_index('Date')[["col2", "col3", "col4", "col5"]].sum(axis=1)
                 .groupby(level=0).transform('sum').to_numpy())
print(df)
Slno Date col2 col3 col4 col5 col6 Total
0 0 01/02/20 2 1 2 5 d 10
1 1 03/02/20 5 1 2 4 g 12
2 2 04/02/20 5 1 2 5 h 13
3 3 05/02/20 4 1 2 6 e 32 # this is duplicated per group
4 4 08/02/20 8 1 2 5 g 16
5 5 05/02/20 8 1 2 8 r 32 # this is duplicated per group
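The same idea can be sketched without touching the index by grouping the row sums on the Date column directly. This is a variation, not the answer's exact code; the frame is rebuilt from the output above, with Slno and col6 omitted, and row_sum is an illustrative name.

```python
import pandas as pd

# Rebuilt from the printed output above (Slno and col6 omitted for brevity).
df = pd.DataFrame({
    "Date": ["01/02/20", "03/02/20", "04/02/20", "05/02/20", "08/02/20", "05/02/20"],
    "col2": [2, 5, 5, 4, 8, 8],
    "col3": [1] * 6,
    "col4": [2] * 6,
    "col5": [5, 4, 5, 6, 5, 8],
})

# Step 1: sum across the chosen columns, one value per row.
row_sum = df[["col2", "col3", "col4", "col5"]].sum(axis=1)

# Step 2: sum those row values per Date and broadcast back to every row.
df["Total"] = row_sum.groupby(df["Date"]).transform("sum")
print(df)
```

Because transform returns a result aligned with the original index, the assignment works without to_numpy() here.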
Pandas sum by groupby, but exclude certain columns
You can select the columns of a groupby:
In [11]: df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum()
Out[11]:
Y1961 Y1962 Y1963
Country Item_Code
Afghanistan 15 10 20 30
25 10 20 30
Angola 15 30 40 50
25 30 40 50
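A small runnable sketch of this column selection, which also shows what happens when a requested column does not exist. The rows and the extra Code column are hypothetical; Y1999 is a deliberately missing column name.

```python
import pandas as pd

# Hypothetical rows; "Code" stands in for a column we do not want summed.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan", "Angola", "Angola"],
    "Item_Code": [15, 25, 15, 25],
    "Code": [2, 2, 7, 7],
    "Y1961": [10, 10, 30, 30],
    "Y1962": [20, 20, 40, 40],
    "Y1963": [30, 30, 50, 50],
})

year_cols = ["Y1961", "Y1962", "Y1963"]

# Only the listed columns are aggregated; "Code" is left out of the result.
sums = df.groupby(["Country", "Item_Code"])[year_cols].sum()
print(sums)

# Asking for a column that is not in the frame raises KeyError.
try:
    df.groupby(["Country", "Item_Code"])[["Y1961", "Y1999"]].sum()
    missing_raises = False
except KeyError:
    missing_raises = True
```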
Note that the list passed must be a subset of the columns, otherwise you'll see a KeyError.
Pandas groupby and sum if column contains substring
Try this:
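# numeric_only=True keeps the text column Type out of the sum on recent pandas versions
pd.concat([df,
           df.groupby(df['Type'].str.split(' ').str[-1]).sum(numeric_only=True).reset_index()],
          ignore_index=True)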
Output:
Type California New York Georgia
0 red car 3 1 3
1 blue car 10 6 3
2 yellow car 9 1 8
3 red truck 1 10 6
4 blue truck 9 7 9
5 yellow truck 4 10 1
6 car 22 8 14
7 truck 14 27 16
Details: You can use the .str accessor to split your Type into two parts, color and vehicle, then slice that list and get the last value, the vehicle, with .str[-1]. Use this vehicle series to groupby your dataframe and sum. Lastly, pd.concat the results of the groupby sum with your original dataframe.
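The steps described above can be sketched as follows. The frame is rebuilt from the output shown earlier; vehicle, totals and result are illustrative names, and numeric_only=True is an assumption added here so that the text column is not summed on recent pandas versions.

```python
import pandas as pd

# Rebuilt from the first six rows of the output shown above.
df = pd.DataFrame({
    "Type": ["red car", "blue car", "yellow car",
             "red truck", "blue truck", "yellow truck"],
    "California": [3, 10, 9, 1, 9, 4],
    "New York": [1, 6, 1, 10, 7, 10],
    "Georgia": [3, 3, 8, 6, 9, 1],
})

# Last word of Type, e.g. "red car" -> "car"; the Series keeps the name "Type".
vehicle = df["Type"].str.split(" ").str[-1]

# Sum the numeric columns per vehicle, then make the key a column again.
totals = df.groupby(vehicle).sum(numeric_only=True).reset_index()

# Append the per-vehicle totals below the original rows.
result = pd.concat([df, totals], ignore_index=True)
print(result)
```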