Pandas Groupby and Sum Only One Column

To keep C in the result you need to include it in the groupby keys (groupby accepts a list of columns).

Give this a try:

df.groupby(['A','C'])['B'].sum()

One other thing to note: if you need to keep working with the result as a DataFrame after the aggregation, pass as_index=False so the group keys come back as regular columns instead of the index. This one gave me problems when I was first working with Pandas. Example:

df.groupby(['A','C'], as_index=False)['B'].sum()
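
To make the difference between the two calls concrete, here is a minimal sketch. The data is made up for illustration (it is not from the question), and it assumes C is constant within each A group.

```python
import pandas as pd

# Hypothetical data: C has the same value for every row of a given A.
df = pd.DataFrame({
    "A": ["x", "x", "y"],
    "B": [1, 2, 5],
    "C": ["red", "red", "blue"],
})

# Default: the group keys become a MultiIndex and the result is a Series.
as_series = df.groupby(["A", "C"])["B"].sum()

# as_index=False: the group keys stay as ordinary columns of a DataFrame.
as_frame = df.groupby(["A", "C"], as_index=False)["B"].sum()

print(as_series)
print(as_frame)
```

The second form is usually easier to merge or export, since A and C are plain columns again.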

Group By - but sum one column, and show original columns

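df_save = df_orig.loc[:, ["A", "C", "E"]]
# one row per A with the summed B and D, merged back on the shared column A
df_agg = df_orig.groupby("A").agg({"B": "sum", "D": "sum"}).reset_index()
df_merged = df_save.merge(df_agg)
for c in ["B", "D"]:
    # cast to object first so the numeric column can hold an empty string
    df_merged[c] = df_merged[c].astype(object)
    # blank out repeated sums so each total is shown only once
    df_merged.loc[df_merged[c].duplicated(), c] = ''
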
ACEBD
AppleGreenX101
PearBrownY15523
PearYellowZ
BananaYellowP44
PlumRedR25
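
One thing to watch with the snippet above: duplicated() is checked on the summed values, so two different groups that happen to have the same total would also get blanked. A possible alternative, sketched here with made-up data and hypothetical column names B_sum and B_shown, is to broadcast the sum with transform and blank rows based on the group key instead.

```python
import pandas as pd

# Hypothetical data: Pear's total (2 + 3) equals Plum's single value (5).
df_orig = pd.DataFrame({
    "A": ["Apple", "Pear", "Pear", "Plum"],
    "C": ["Green", "Brown", "Yellow", "Red"],
    "B": [1, 2, 3, 5],
})

# Broadcast each group's sum back onto every row of that group.
df_orig["B_sum"] = df_orig.groupby("A")["B"].transform("sum")

# True only on the first row of each A group.
first_in_group = ~df_orig["A"].duplicated()

# Show the sum on the first row of each group and an empty string elsewhere.
# The cast to object lets one column hold both numbers and strings.
df_orig["B_shown"] = df_orig["B_sum"].astype(object).where(first_in_group, "")

print(df_orig)
```

Here Plum keeps its total even though it equals Pear's, because the blanking looks at A rather than at the summed value.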

Pandas groupby() on one column and then sum on another

You need groupby, aggregate with sum, and then reshape with unstack:

df = df.groupby(['Name','Year'])['Goals_scored'].sum().unstack()
print (df)
Year        2014  2015  2016
Name
John Smith     5     5     1

Alternatively, use pivot_table:

df = df.pivot_table(index='Name',columns='Year', values='Goals_scored', aggfunc='sum')
print (df)
Year        2014  2015  2016
Name
John Smith     5     5     1

Finally, to turn the index back into a column:

df = df.reset_index().rename_axis(None, axis=1)
print (df)
         Name  2014  2015  2016
0  John Smith     5     5     1
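
For a self-contained run of the steps above, here is a small sketch. The input rows are invented so that the totals match the output shown (5, 5, 1); the original question's data is not available here.

```python
import pandas as pd

# Hypothetical long-format input: one row per match.
df = pd.DataFrame({
    "Name": ["John Smith"] * 4,
    "Year": [2014, 2014, 2015, 2016],
    "Goals_scored": [3, 2, 5, 1],
})

# Route 1: sum per (Name, Year), then move Year into the columns.
wide_unstack = df.groupby(["Name", "Year"])["Goals_scored"].sum().unstack()

# Route 2: the same reshape in a single pivot_table call.
wide_pivot = df.pivot_table(index="Name", columns="Year",
                            values="Goals_scored", aggfunc="sum")

# Turn the Name index into a column and drop the "Year" label on the columns.
flat = wide_unstack.reset_index().rename_axis(None, axis=1)

print(flat)
```

Both routes give the same wide table here. If some Name/Year combinations were missing, both would show NaN in those cells.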

Pandas - dataframe groupby - how to get sum of multiple columns

Using apply (note the list inside the brackets when selecting more than one column):

df.groupby(['col1', 'col2'])[["col3", "col4"]].apply(lambda x : x.astype(int).sum())
Out[1257]:
           col3  col4
col1 col2
a    c        2     4
     d        1     2
b    d        1     2
     e        2     4

If you prefer agg:

df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})
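
If the value columns are stored as strings (which the astype(int) above suggests, though that is an assumption), it may be simpler to convert them once and then sum directly. A rough sketch with invented rows chosen to reproduce the output shown:

```python
import pandas as pd

# Hypothetical input where the numbers arrived as strings.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": ["1"] * 6,
    "col4": ["2"] * 6,
})

# Convert once up front instead of inside every group.
df = df.astype({"col3": int, "col4": int})

# Select the value columns with a list, then sum per group.
summed = df.groupby(["col1", "col2"])[["col3", "col4"]].sum()

print(summed)
```

This avoids apply entirely, which tends to be faster on larger frames because sum runs as a built-in groupby aggregation.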

Pandas GroupBy Total Row for Days of the Week, then sum only on one column

Replacing the NaN values with "" should solve your problem. After you add the Total row to the dfTemp DataFrame, add this line of code:

CODE

dfTemp.fillna(value="", inplace=True)

If you want to avoid summing the categorical column at all, build the Total row explicitly:

dfTotal = pd.DataFrame({"inc_Day_of_Week": "", "inc_volume": dfTemp.inc_volume.sum()}, index=["Total"])
dfTemp = pd.concat([dfTemp, dfTotal])

OUTPUT

      inc_Day_of_Week  inc_volume
0              Monday         3.0
1            Thursday         1.0
2             Tuesday         1.0
3           Wednesday         2.0
Total                         7.0
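
Here is a runnable version of the second approach. The starting dfTemp is reconstructed from the output above, so treat it as an assumption about what the grouped frame looked like.

```python
import pandas as pd

# Assumed shape of dfTemp after the groupby (values taken from the output above).
dfTemp = pd.DataFrame({
    "inc_Day_of_Week": ["Monday", "Thursday", "Tuesday", "Wednesday"],
    "inc_volume": [3.0, 1.0, 1.0, 2.0],
})

# Build a one-row frame labelled "Total": blank day, summed volume.
dfTotal = pd.DataFrame(
    {"inc_Day_of_Week": "", "inc_volume": dfTemp["inc_volume"].sum()},
    index=["Total"],
)

# Append the Total row below the per-day rows.
dfTemp = pd.concat([dfTemp, dfTotal])

print(dfTemp)
```

Because the day column is set to "" explicitly, no NaN appears and no fillna is needed.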

Pandas dataframe, how can I group by single column and apply sum to multiple column and add new sum column?

You can set Date as the index, sum the columns along axis=1, then group by level=0 and use transform('sum'):

df['Total'] = (df.set_index('Date')[["col2", "col3", "col4", "col5"]].sum(axis=1)
               .groupby(level=0).transform('sum').to_numpy())


print(df)

   Slno      Date  col2  col3  col4  col5 col6  Total
0     0  01/02/20     2     1     2     5    d     10
1     1  03/02/20     5     1     2     4    g     12
2     2  04/02/20     5     1     2     5    h     13
3     3  05/02/20     4     1     2     6    e     32  # this is duplicated per group
4     4  08/02/20     8     1     2     5    g     16
5     5  05/02/20     8     1     2     8    r     32  # this is duplicated per group
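
A variation on the same idea that skips set_index and to_numpy: compute the row sums as a Series and group that Series by the Date column. This is only a sketch, using the rows from the output above (Slno and col6 left out for brevity).

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["01/02/20", "03/02/20", "04/02/20", "05/02/20", "08/02/20", "05/02/20"],
    "col2": [2, 5, 5, 4, 8, 8],
    "col3": [1, 1, 1, 1, 1, 1],
    "col4": [2, 2, 2, 2, 2, 2],
    "col5": [5, 4, 5, 6, 5, 8],
})

# Step 1: sum across the value columns for each row.
row_sums = df[["col2", "col3", "col4", "col5"]].sum(axis=1)

# Step 2: add up the row sums within each Date and broadcast the result back.
# transform keeps the original index, so the assignment lines up row by row.
df["Total"] = row_sums.groupby(df["Date"]).transform("sum")

print(df)
```

The two rows dated 05/02/20 both get 32 (13 + 19), matching the duplicated values in the output above.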

Pandas sum by groupby, but exclude certain columns

You can select the columns of a groupby:

In [11]: df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum()
Out[11]:
                       Y1961  Y1962  Y1963
Country     Item_Code
Afghanistan 15            10     20     30
            25            10     20     30
Angola      15            30     40     50
            25            30     40     50

Note that the list you pass must be a subset of the DataFrame's columns, otherwise you'll get a KeyError.
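
To illustrate the KeyError note, here is a small sketch. The frame is rebuilt from the output above, and Y1964 is a deliberately missing, hypothetical column name. Filtering the wanted list against df.columns first is one way to avoid the error.

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan", "Angola", "Angola"],
    "Item_Code": [15, 25, 15, 25],
    "Y1961": [10, 10, 30, 30],
    "Y1962": [20, 20, 40, 40],
    "Y1963": [30, 30, 50, 50],
})

keys = ["Country", "Item_Code"]
wanted = ["Y1961", "Y1962", "Y1964"]  # Y1964 does not exist in df

# Selecting a missing column from the groupby raises KeyError.
try:
    df.groupby(keys)[wanted].sum()
    raised = False
except KeyError:
    raised = True

# Keep only the columns that are actually present, then sum.
present = [c for c in wanted if c in df.columns]
result = df.groupby(keys)[present].sum()

print(raised)
print(result)
```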

Pandas groupby and sum if column contains substring

Try this:

# numeric_only=True keeps the string column Type out of the sum
pd.concat([df,
           df.groupby(df['Type'].str.split(' ').str[-1]).sum(numeric_only=True).reset_index()],
          ignore_index=True)

Output:

           Type  California  New York  Georgia
0       red car           3         1        3
1      blue car          10         6        3
2    yellow car           9         1        8
3     red truck           1        10        6
4    blue truck           9         7        9
5  yellow truck           4        10        1
6           car          22         8       14
7         truck          14        27       16

Details:

You can use .str accessor to split your Type into two parts color and vehicle, then slice that list and get the last value, vehicle, with .str[-1]. Use this vehicle series to groupby your dataframe and sum. Lastly, pd.concat the results of the groupby sum with your original dataframe.
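
The steps described above can be run end to end like this. The frame is rebuilt from the first six rows of the output. The numeric_only=True argument is an addition in this sketch so that the text column is not summed along with the numbers.

```python
import pandas as pd

df = pd.DataFrame({
    "Type": ["red car", "blue car", "yellow car",
             "red truck", "blue truck", "yellow truck"],
    "California": [3, 10, 9, 1, 9, 4],
    "New York": [1, 6, 1, 10, 7, 10],
    "Georgia": [3, 3, 8, 6, 9, 1],
})

# Last word of Type ("car" or "truck"); the Series keeps the name "Type".
vehicle = df["Type"].str.split(" ").str[-1]

# Sum the numeric columns per vehicle, then bring the key back as a Type column.
totals = df.groupby(vehicle).sum(numeric_only=True).reset_index()

# Stack the per-vehicle totals under the original rows.
combined = pd.concat([df, totals], ignore_index=True)

print(combined)
```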


