How to Group by Column Values into Row and Column Header and Then Sum the Value

How to Transpose header row value group by particular column value

You can do this with PowerQuery.

Select any cell in your source data. Use Data>Get & Transform Data>From Table/Range.

The PowerQuery editor will open.

Select the Cust_Name column by clicking the column header. Use Transform>Unpivot Columns>Unpivot Other Columns.

At this point, optionally filter the Value column to exclude 0.

Now use Home>Close & Load to put the data back into your workbook.

You can now create a pivot table to get your subtotals.

Here is the query from the Advanced Editor dialog in the PowerQuery editor:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Cust_Name", type text}, {"Prod1", Int64.Type}, {"Prod2", Int64.Type}, {"Prod3", Int64.Type}, {"Prod4", Int64.Type}, {"Prod5", Int64.Type}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Cust_Name"}, "Attribute", "Value")
in
    #"Unpivoted Other Columns"
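
For comparison with the pandas answers further down this page, here is a rough sketch of the same unpivot, filter, and subtotal steps in pandas. The sample data is made up, and only the Cust_Name/ProdN column layout is taken from the query above; treat it as an illustration, not a replacement for the PowerQuery steps.

```python
import pandas as pd

# Hypothetical sample data using the Cust_Name / ProdN layout from the query.
df = pd.DataFrame({
    "Cust_Name": ["Ann", "Bob", "Ann"],
    "Prod1": [1, 0, 2],
    "Prod2": [0, 3, 4],
})

# Unpivot every column except Cust_Name (the "Unpivot Other Columns" step).
long = df.melt(id_vars="Cust_Name", var_name="Attribute", value_name="Value")

# Optional: drop the zero rows, as in the filter step.
long = long[long["Value"] != 0]

# Subtotals per customer and product (the pivot table step).
totals = long.pivot_table(index="Cust_Name", columns="Attribute",
                          values="Value", aggfunc="sum", fill_value=0)
print(totals)
```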

sum values in column grouped by another column pandas

Use groupby, transform and dropna:

print(df.assign(y=df.groupby(df["x"].notnull().cumsum())["y"].transform('sum'))
        .dropna(subset=["x"]))

  country  id     x    y
0      AT  11  50.0  294
3      AT  22  40.0   50
4      AT  23  30.0   23
5      AT  61  40.0  166
7      UK  11  40.0  126
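
To see why this works, it helps to look at the grouping key on its own. The sketch below uses a small made-up frame (the original input is not shown here), in which rows with a missing x belong to the nearest row above that has an x.

```python
import numpy as np
import pandas as pd

# Hypothetical input: a NaN in x means "this row belongs to the row above".
df = pd.DataFrame({
    "country": ["AT", "AT", "AT", "UK", "UK"],
    "id": [11, 12, 13, 11, 12],
    "x": [50.0, np.nan, np.nan, 40.0, np.nan],
    "y": [100, 150, 44, 100, 26],
})

# Every non-null x starts a new block, so the running count labels the blocks.
block = df["x"].notnull().cumsum()
print(block.tolist())

# Put each block's total of y on every row of that block,
# then keep only the rows that started a block.
result = df.assign(y=df.groupby(block)["y"].transform("sum")).dropna(subset=["x"])
print(result)
```

Note that if the very first row has a missing x, it ends up in a block labelled 0 and is removed by dropna along with its y value.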

Pandas groupby and sum if column contains substring

Try this:

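# numeric_only=True keeps the text column 'Type' out of the sum on recent pandas versions
pd.concat([df,
           df.groupby(df['Type'].str.split(' ').str[-1]).sum(numeric_only=True).reset_index()],
          ignore_index=True)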

Output:

           Type  California  New York  Georgia
0       red car           3         1        3
1      blue car          10         6        3
2    yellow car           9         1        8
3     red truck           1        10        6
4    blue truck           9         7        9
5  yellow truck           4        10        1
6           car          22         8       14
7         truck          14        27       16

Details:

You can use the .str accessor to split Type into two parts, color and vehicle, then take the last element of each resulting list (the vehicle) with .str[-1]. Use this vehicle Series as the key to group your dataframe and sum. Lastly, pd.concat the result of the groupby sum with your original dataframe.
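
Here are the same steps spelled out one at a time, as a sketch. The frame is rebuilt from the first six rows of the output above; the intermediate names vehicle, totals and combined are mine, and numeric_only=True is my addition so the text column is not summed on newer pandas versions.

```python
import pandas as pd

# Rows 0-5 of the output above, i.e. the original data.
df = pd.DataFrame({
    "Type": ["red car", "blue car", "yellow car",
             "red truck", "blue truck", "yellow truck"],
    "California": [3, 10, 9, 1, 9, 4],
    "New York": [1, 6, 1, 10, 7, 10],
    "Georgia": [3, 3, 8, 6, 9, 1],
})

# Step 1: last word of each Type value ("car" or "truck").
vehicle = df["Type"].str.split(" ").str[-1]

# Step 2: sum the numeric columns per vehicle and turn the key back into a column.
totals = df.groupby(vehicle).sum(numeric_only=True).reset_index()

# Step 3: append the totals below the original rows.
combined = pd.concat([df, totals], ignore_index=True)
print(combined)
```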

Read CSV group by 1 column and apply sum, without pandas

You're actually very close. Just sum the values read while rewriting the file. Note that when you open a file in a with statement, you don't have to close it explicitly; that happens automatically when the block exits. Also note that CSV files should be opened with newline='' (for both reading and writing), as per the documentation.

import csv

index = {}

# Collect every value seen for each key in the first column.
with open('event.csv', newline='') as csv_file:
    cr = csv.reader(csv_file)
    for row in cr:
        index.setdefault(row[0], []).append(int(row[1]))

# Write one row per key with the sum of its values.
with open('event2.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for key, values in index.items():
        value = sum(values)
        writer.writerow([key, value])

print('-fini-')

The above could be written a little more concisely by eliminating some temporary variables and using a generator expression:

import csv

index = {}

with open('event.csv', newline='') as csv_file:
    for row in csv.reader(csv_file):
        index.setdefault(row[0], []).append(int(row[1]))

with open('event2.csv', 'w', newline='') as csv_file:
    csv.writer(csv_file).writerows([key, sum(values)] for key, values in index.items())

print('-fini-')
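
If you don't need the individual values afterwards, another option is to keep a running total per key instead of a list. This is only a sketch: the helper name sum_by_first_column is made up, and io.StringIO with three sample rows stands in for event.csv and event2.csv so the snippet runs without any files.

```python
import csv
import io
from collections import defaultdict


def sum_by_first_column(rows):
    """Return {key: total} for rows shaped like [key, number, ...]."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[0]] += int(row[1])
    return dict(totals)


# Hypothetical sample data standing in for event.csv.
source = io.StringIO("a,1\nb,2\na,3\n", newline="")
totals = sum_by_first_column(csv.reader(source))
print(totals)

# Stand-in for event2.csv: one output row per key.
target = io.StringIO(newline="")
csv.writer(target).writerows(totals.items())
print(target.getvalue())
```

With real files you would pass csv.reader(csv_file) to the helper inside the same with blocks as above.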

Pandas - dataframe groupby - how to get sum of multiple columns

By using apply

df.groupby(['col1', 'col2'])[["col3", "col4"]].apply(lambda x : x.astype(int).sum())
Out[1257]:
           col3  col4
col1 col2
a    c        2     4
     d        1     2
b    d        1     2
     e        2     4

If you would rather use agg:

df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})
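
A variation worth considering, if the columns hold numbers stored as text (which the astype(int) above suggests), is to convert once up front and then call sum directly on the selected columns. The data below is made up so that it gives the same totals as the output above.

```python
import pandas as pd

# Hypothetical data: col3 and col4 hold numbers stored as strings.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b"],
    "col2": ["c", "c", "d", "d", "e", "e"],
    "col3": ["1", "1", "1", "1", "1", "1"],
    "col4": ["2", "2", "2", "2", "2", "2"],
})

# Convert the text columns to integers once, before grouping.
converted = df.astype({"col3": int, "col4": int})

# A list inside the brackets selects several columns to aggregate.
result = converted.groupby(["col1", "col2"])[["col3", "col4"]].sum()
print(result)
```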

Make a grouped column by sum of another column with pandas

Use GroupBy.transform with sum, then create a MultiIndex with DataFrame.set_index to get the display you want. Note that repeated values in the outer levels of a MultiIndex are still there; they are just not displayed:

df['Sum'] = df.groupby('Group')['Cost'].transform('sum')
df = df.set_index(['Group','Sum','Cost'])

Or:

df1 = (df.assign(Sum = df.groupby('Group')['Cost'].transform('sum'))
         .set_index(['Group','Sum','Cost']))
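
A small runnable sketch of the above. The data and the extra Item column are made up; some other column is needed so that something is left to display once Group, Sum and Cost move into the index.

```python
import pandas as pd

# Hypothetical data; "Item" is just a leftover column to display.
df = pd.DataFrame({
    "Group": ["A", "A", "B"],
    "Item": ["pen", "ink", "cup"],
    "Cost": [10, 20, 5],
})

# transform('sum') returns one value per original row, not one per group.
df["Sum"] = df.groupby("Group")["Cost"].transform("sum")

# Moving the columns into the index gives the grouped-looking display.
indexed = df.set_index(["Group", "Sum", "Cost"])
print(indexed)

# The repeated outer values are hidden in the printout but still stored.
print(indexed.index.get_level_values("Sum").tolist())
```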

Pandas Groupby and Sum Only One Column

The only way to do this would be to include C in your groupby (the groupby function can accept a list).

Give this a try:

df.groupby(['A','C'])['B'].sum()

One other thing to note: if you need to work with the result after the aggregation, you can also pass as_index=False to get back a dataframe with the group keys as regular columns. This one gave me problems when I was first working with Pandas. Example:

df.groupby(['A','C'], as_index=False)['B'].sum()
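
To make the difference concrete, here is a quick sketch with made-up data that runs both calls and shows what each one returns.

```python
import pandas as pd

# Hypothetical data where C is constant within each A.
df = pd.DataFrame({
    "A": ["x", "x", "y"],
    "C": ["p", "p", "q"],
    "B": [1, 2, 3],
})

# Default: a Series of sums indexed by (A, C).
as_series = df.groupby(["A", "C"])["B"].sum()
print(type(as_series).__name__)

# as_index=False: a DataFrame with A, C and B as ordinary columns.
as_frame = df.groupby(["A", "C"], as_index=False)["B"].sum()
print(as_frame)
```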

Summing columns in Dataframe that have matching column headers

You probably want to group by the first level of the column labels, along the column axis (axis=1), and then perform a .sum(), like:

>>> df.groupby(level=0,axis=1).sum().add_suffix('_sum')
   M1_sum  M2_sum
0       4       7
1       9      12
2      12      15
3      12      18
4      25      50

If we rename the last column to M1 instead, it will again group this correctly:

>>> df
   M1  M2  M1  M1
0   1   2   3   5
1   2   4   7   8
2   3   6   9   9
3   4   8   8  10
4   5  10  20  40
>>> df.groupby(level=0,axis=1).sum().add_suffix('_sum')
   M1_sum  M2_sum
0       9       2
1      17       4
2      21       6
3      22       8
4      65      10
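
As far as I know, recent pandas releases warn that axis=1 in groupby is deprecated. One way around that, sketched below, is to transpose, group the rows by label, and transpose back. The frame is the renamed one printed above.

```python
import pandas as pd

# The frame from above, with three columns sharing the label M1.
df = pd.DataFrame(
    [[1, 2, 3, 5], [2, 4, 7, 8], [3, 6, 9, 9], [4, 8, 8, 10], [5, 10, 20, 40]],
    columns=["M1", "M2", "M1", "M1"],
)

# Transpose so the column labels become the row index, group the rows
# by that label, sum, and transpose back to the original orientation.
sums = df.T.groupby(level=0).sum().T.add_suffix("_sum")
print(sums)
```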

