Sub Totals and Grand Totals in Python

Sub totals and grand totals in Python

You can use pivot to get from your current output to your desired output and then sum to calculate the totals you want.

import pandas as pd

df = df.reset_index().pivot('index', values='Column5', columns='Column4')

# Add grand total columns, summing across all other columns
df['Grand Total'] = df.sum(axis=1)
df.columns.name = None
df.index.name = None

# Add the grand total row, summing all values in a column
df.loc['Grand Total', :] = df.sum()

df is now:

             Actionable  Duplicate  Informative  Non-actionable  Grand Total
2018-05-19 NaN 220.0 3.0 NaN 223.0
2018-05-20 5.0 270.0 859.0 2.0 1136.0
2018-05-21 8.0 295.0 17.0 NaN 320.0
2018-05-22 10.0 424.0 36.0 NaN 470.0
2018-05-23 8.0 157.0 3.0 NaN 168.0
2018-05-24 5.0 78.0 3.0 NaN 86.0
2018-05-25 3.0 80.0 NaN NaN 83.0
Grand Total 39.0 1524.0 921.0 2.0 2486.0

Add grand total and subtotal for Pandas pivot table

A fairly similar solution to your previous question, however you can insert blank strings for your missing levels (inspired by @piRSquared here):

out = pd.concat([d.append(d.sum().rename((k, '', '', '', 'Subtotal'))) for k, d in table.groupby('Region')]).append(table.sum().rename(('Grand', '', '', '', 'Total')))

out.index = pd.MultiIndex.from_tuples(out.index)

Yields:

                                 Curr_Revenue     ...       Prior_Sales
America US AL Develop qwe 4852 ... 898
Launch lkj 3568 ... 220
Start opi 6776 ... 521
Subtotal 15196 ... 1639
Asia KR HN Develop efg 5975 ... 132
Start vcx 3046 ... 131
Subtotal 9021 ... 263
Europe GB JS Develop bcd 7668 ... 789
Launch fsd 2349 ... 569
Subtotal 10017 ... 1358
Oceanian AU TL Launch gtp 1111 ... 777
Start abc 4530 ... 235
Subtotal 5641 ... 1012
Grand Total 39875 ... 4272

Adding subtotal / grand total row in pandas muiltiindex produces tuple as an index

Here is problem in Series called s are 2 levels MultiIndex, in dd are 3 levels, so in append are created tuple.

Solution is set 3 levels MultiIndex in MultiIndex.from_product, so same number levels like dd and solution working correct:

For avoid sorting all another level in DataFrame.sort_index add sort_remaining=False:

dd =  df_from_dict.groupby(['category', 'subcategory','discount'])[['sales_1', 'sales_2']].sum()

s = dd.groupby(level=0).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total'], ['']])
print (s)

dd = dd.append(s).sort_index(level=0, sort_remaining=False)
dd.loc['Grand Total', :] = dd.sum().values / 2
print (dd)
sales_1 sales_2
category subcategory discount
Bath Table 30-40 10871.00 1983.0
Total 10871.00 1983.0
Dining Chairs 30-40 53258.04 37031.0
Total 53258.04 37031.0
Kitchen Mirror 30-40 7163.16 2166.0
Stool 30-40 6736.53 1613.0
Total 13899.69 3779.0
Living room chair 30-40 8047.16 3442.0
mirror 30-40 8601.16 11117.0
Total 16648.32 14559.0
Grand Total 94677.05 57352.0

groupby with totals/subtotals

You can aggregate by different columns, so for performance is better not use nested groupby.apply but rather multple aggregation, last join them togehether by concat, change order of columns by DataFrame.reindex and last sorting per first 2 columns:

df1 = df.groupby(['Strategy', 'AssetClass', 'Symbol'], as_index=False).sum()

df2 = (df1.groupby(['Strategy', 'AssetClass'], as_index=False)['Indicator'].sum()
.assign(Symbol = ''))

df3 = (df1.groupby('Strategy', as_index=False)['Indicator'].sum()
.assign(AssetClass = ''))

df = (pd.concat([df3, df2, df1])
.reindex(df.columns, axis=1)
.fillna('')
.sort_values(['Strategy','AssetClass'], ignore_index=True))
print (df)
Strategy AssetClass Symbol Value Indicator
0 Strat1 3.9
1 Strat1 OPT 1.2
2 Strat1 OPT OPT_ABC1 50.0 -0.3
3 Strat1 OPT OPT_ABC2 50.0 1.5
4 Strat1 STK 2.7
5 Strat1 STK STK_ABC 50.0 2.7
6 Strat2 -3.8
7 Strat2 STK -3.8
8 Strat2 STK STK_XYZ 70.0 -3.8
9 Strat3 10.0
10 Strat3 OPT 10.0
11 Strat3 OPT OPT_MNO 25.0 10.0

Subtotals and Grand totals across two axis

Admittedly not elegant, but works...

indices = ["Industry","Firm","Project"]
l = list()
for index in [indices[0],indices[0:2],indices,None]:
tmp = pd.pivot_table(df,values="Cost",index=index,columns=["Month"],aggfunc=np.sum)
tmp["Total"] = tmp.sum(axis=1)
tmp.reset_index(inplace=True)
for col in indices:
if col not in tmp.columns:
tmp[col] = ""
tmp.set_index(indices,inplace=True)
tmp.drop("index",axis=1,errors='ignore',inplace=True)
l.append(tmp)

l[-1].index = [("Total","","")]
output = pd.concat(l[:-1]).sort_index()
output = pd.concat([output,l[-1]])

output

Month                         06-2020  07-2020  08-2020    Total
Industry Firm Project
Auto 4000.0 5000.0 5000.0 14000.0
Company 1 4000.0 5000.0 5000.0 14000.0
Alpha 3000.0 0.0 0.0 3000.0
NUKDJF 1000.0 5000.0 5000.0 11000.0
Lamps 3500.0 8500.0 9500.0 21500.0
ASDF Inc. 2000.0 500.0 500.0 3000.0
BigThing 2000.0 500.0 500.0 3000.0
Super Corp 1500.0 8000.0 9000.0 18500.0
SupProj 1500.0 8000.0 9000.0 18500.0
Total 7500.0 13500.0 14500.0 35500.0

Adding a Grand Total to a Pandas Pivot Table

Based on your example that is posted:

# read your data from clipboard
df = pd.read_clipboard()

# run your pivot_table code from above
report = df.groupby(['SUPER_TYPE']).apply(lambda sub_df: sub_df.pivot_table(index=['STRATA', 'OS_TYPE', 'STAND_NUMB', 'SILV_PRES'], values=['ACRES'],aggfunc=np.sum, margins=True,margins_name= 'TOTAL'))

# this is creating a new row at level(1) called grand total
# set it equal to the sum of ACRES where level(1) != 'Total' so you are not counting the calculated totals in the total sum
report.loc[('', 'Grand Total','','',''), :] = report[report.index.get_level_values(1) != 'TOTAL'].sum()
report


ACRES
SUPER_TYPE STRATA OS_TYPE STAND_NUMB SILV_PRES
HS HS3B HS3B 3092.0 OSR/2SS/SCC 17.3
3580.0 OSR/2SS/SCC 8.1
3581.0 OSR/2SS/SCC 16.6
3587.0 OSR/2SS/SCC 13.8
3594.0 OSR/2SS/SCC 31.7
3607.0 OSR/2SS/SCC 27.7
TOTAL 115.2
HW H3AB H3A 3571.0 OSR/2SS/SCC 30.7
3573.0 OSR/2SS/SCC 30.4
3585.0 OSR/2SS/SCC 25.8
3588.0 OSR/2SS/SCC 18.1
3589.0 OSR/2SS/SCC 54.7
3597.0 OSR/2SS/SCC 41.6
3601.0 OSR/2SS/SCC 11.9
. . .

Grand Total 813.6


Related Topics



Leave a reply



Submit