How to Transpose header row value group by particular column value
You can do this with PowerQuery.
Select any cell in your source data. Use Data>Get & Transform Data>From Table/Range.
The PowerQuery editor will open, like this:
Select the Cust_Name column by clicking the column header. Use Transform>Unpivot Columns>Unpivot Other Columns:
At this point, optionally filter the Value column to exclude 0.
Now use Home>Close & Load to put the data back into your workbook.
You can now create a pivot table to get your sub totals:
Here is the query from the Advanced Editor dialog in the PowerQuery editor:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Cust_Name", type text}, {"Prod1", Int64.Type}, {"Prod2", Int64.Type}, {"Prod3", Int64.Type}, {"Prod4", Int64.Type}, {"Prod5", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Cust_Name"}, "Attribute", "Value")
in
#"Unpivoted Other Columns"
sum values in column grouped by another column pandas
Use groupby
, transform
and dropna
:
print (df.assign(y=df.groupby(df["x"].notnull().cumsum())["y"].transform('sum'))
.dropna(subset=["x"]))
country id x y
0 AT 11 50.0 294
3 AT 22 40.0 50
4 AT 23 30.0 23
5 AT 61 40.0 166
7 UK 11 40.0 126
Pandas groupby and sum if column contains substring
Try this:
pd.concat([df,
df.groupby(df['Type'].str.split(' ').str[-1]).sum().reset_index()],
ignore_index=True)
Output:
Type California New York Georgia
0 red car 3 1 3
1 blue car 10 6 3
2 yellow car 9 1 8
3 red truck 1 10 6
4 blue truck 9 7 9
5 yellow truck 4 10 1
6 car 22 8 14
7 truck 14 27 16
Details:
You can use .str
accessor to split
your Type into two parts color and vehicle, then slice that list and get the last value, vehicle, with .str[-1]
. Use this vehicle series to groupby
your dataframe and sum
. Lastly, pd.concat
the results of the groupby sum with your original dataframe.
Read CSV group by 1 column and apply sum, without pandas
You're actually very close. Just sum the values read while rewriting the file. Note that when using with
on a file, you don't have to explicitly close them, it does it for you automatically. Also note that CSV files should be opened with newline=''
—for reading and writing—as per the documentation.
import csv
index = {}
with open('event.csv', newline='') as csv_file:
cr = csv.reader(csv_file)
for row in cr:
index.setdefault(row[0], []).append(int(row[1]))
with open('event2.csv', 'w', newline='\n') as csv_file:
writer = csv.writer(csv_file)
for key, values in index.items():
value = sum(values)
writer.writerow([key, value])
print('-fini-')
The above could be written a little more concisely by eliminating some temporary variables and using a generator expression:
import csv
index = {}
with open('event.csv', newline='') as csv_file:
for row in csv.reader(csv_file):
index.setdefault(row[0], []).append(int(row[1]))
with open('event2.csv', 'w', newline='\n') as csv_file:
csv.writer(csv_file).writerows([key, sum(values)] for key, values in index.items())
print('-fini-')
Pandas - dataframe groupby - how to get sum of multiple columns
By using apply
df.groupby(['col1', 'col2'])["col3", "col4"].apply(lambda x : x.astype(int).sum())
Out[1257]:
col3 col4
col1 col2
a c 2 4
d 1 2
b d 1 2
e 2 4
If you want to agg
df.groupby(['col1', 'col2']).agg({'col3':'sum','col4':'sum'})
Make a grouped column by sum of another column with pandas
Use GroupBy.transform
with sum
and then for display your way create MultiIndex
by DataFrame.set_index
, but 'missing'
values in MulitIndex
are only not displaing:
df['Sum'] = df.groupby('Group')['Cost'].transform('sum')
df = df.set_index(['Group','Sum','Cost'])
Or:
df1 = (df.assign(Sum = df.groupby('Group')['Cost'].transform('sum'))
.set_index(['Group','Sum','Cost']))
Pandas Groupby and Sum Only One Column
The only way to do this would be to include C in your groupby (the groupby function can accept a list).
Give this a try:
df.groupby(['A','C'])['B'].sum()
One other thing to note, if you need to work with df after the aggregation you can also use the as_index=False
option to return a dataframe object. This one gave me problems when I was first working with Pandas. Example:
df.groupby(['A','C'], as_index=False)['B'].sum()
Summing columns in Dataframe that have matching column headers
You probably want to groupby
the first level, and over the second axis, and then perform a .sum()
, like:
>>> df.groupby(level=0,axis=1).sum().add_suffix('_sum')
M1_sum M2_sum
0 4 7
1 9 12
2 12 15
3 12 18
4 25 50
If we rename the last column to M1
instead, it will again group this correctly:
>>> df
M1 M2 M1 M1
0 1 2 3 5
1 2 4 7 8
2 3 6 9 9
3 4 8 8 10
4 5 10 20 40
>>> df.groupby(level=0,axis=1).sum().add_suffix('_sum')
M1_sum M2_sum
0 9 2
1 17 4
2 21 6
3 22 8
4 65 10
Related Topics
Find and Remove a Line in Multiple Files
How to Pipe or Redirect the Output of Curl -V
Find All Zero-Byte Files in Directory and Subdirectories
How to Pass Argument in Expect Through the Command Line in a Shell Script
How to Speed Up Linux Kernel Compilation
How Does Epoll's Epollexclusive Mode Interact with Level-Triggering
Changing Highlight Line Color in Emacs
Iproute2 Commands for Mpls Configuration
What Is the Fastest Way to Find All the File with the Same Inode
How and When to Use /Dev/Shm for Efficiency
How to Filter the Running Nodes
Using a Remote Host's Usb Port as Local Usb (Linux and Windows)
Linux Free Shows High Memory Usage But Top Does Not
How to Run Nohup and Write Its Pid File in a Single Bash Statement
Why Does the Linux Kernel Repository Have Only One Branch
How to Download a File from the Internet to My Linux Server with Bash
How to Add a Custom Footer to Sphinx Documentation? (Restructuredtext)