pandas convert columns to percentages of the totals
You can do this using the basic pandas methods .div and .sum, using the axis argument to make sure the calculations happen the way you want:
cols = ['<80%', '80-90', '>90']
df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)
- Calculate the sum of each row (df[cols].sum(axis=1)). axis=1 makes the summation occur across the columns of each row, rather than down each column.
- Divide the dataframe by the resulting series (df[cols].div(df[cols].sum(axis=1), axis=0)). axis=0 aligns the series with the row index, so each row is divided by its own total.
- To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
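As a rough check of the steps above, here is a minimal sketch on made-up data. The numbers are hypothetical; only the column names come from the snippet. After the conversion, every row should add up to 100.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the answer above.
df = pd.DataFrame({
    "<80%": [5, 2],
    "80-90": [5, 6],
    ">90": [10, 8],
})
cols = ["<80%", "80-90", ">90"]

# One total per row (axis=1 sums across the columns of each row).
row_totals = df[cols].sum(axis=1)

# axis=0 aligns row_totals with the row index, so each row is
# divided by its own total before scaling to 0-100.
pct = df[cols].div(row_totals, axis=0).multiply(100)

print(pct)
print(pct.sum(axis=1))  # each row sums to 100
```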
Python Pandas - Convert column to percentage on Groupby DF
Here's one way
In [1371]: (100. * df / df.sum()).round(0)
Out[1371]:
2014/2015 2015/2016 2016/2017 2017/2018
FinancialYear
April 9.0 6.0 6.0 24.0
May 7.0 7.0 6.0 29.0
June 5.0 10.0 6.0 24.0
July 6.0 10.0 5.0 24.0
August 9.0 8.0 10.0 0.0
September 9.0 6.0 12.0 0.0
October 9.0 10.0 8.0 0.0
November 10.0 8.0 9.0 0.0
December 9.0 11.0 7.0 0.0
January 9.0 8.0 10.0 0.0
February 9.0 7.0 11.0 0.0
March 9.0 8.0 10.0 0.0
And, if you want the values rounded to 1 decimal place and stored as strings ending in '%':
In [1375]: (100. * df / df.sum()).round(1).astype(str) + '%'
Out[1375]:
2014/2015 2015/2016 2016/2017 2017/2018
FinancialYear
April 8.7% 6.4% 6.4% 23.5%
May 7.4% 6.9% 6.1% 29.4%
June 4.8% 10.3% 6.4% 23.5%
July 5.9% 10.3% 5.2% 23.5%
August 9.2% 8.0% 9.9% 0.0%
September 8.9% 6.1% 12.0% 0.0%
October 9.2% 9.8% 7.9% 0.0%
November 9.7% 8.2% 8.7% 0.0%
December 9.2% 10.9% 6.7% 0.0%
January 8.7% 8.0% 10.2% 0.0%
February 9.4% 6.9% 10.8% 0.0%
March 9.2% 8.2% 9.6% 0.0%
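The same idea on a tiny, made-up frame (the figures below are hypothetical, not the data from the question) may make the mechanics easier to follow: df.sum() returns one total per column, and dividing by it aligns on column labels, so each column ends up summing to 100.

```python
import pandas as pd

# Hypothetical two-month frame; only the layout mirrors the output above.
df = pd.DataFrame(
    {"2014/2015": [1, 3], "2015/2016": [2, 2]},
    index=pd.Index(["April", "May"], name="FinancialYear"),
)

# df.sum() gives one total per column; the division aligns on column labels.
col_pct = 100.0 * df / df.sum()

# Optional display version: strings with a trailing '%'.
labels = col_pct.round(1).astype(str) + "%"

print(col_pct)
print(labels)
```

Keeping col_pct numeric and building labels separately leaves the percentages usable for further arithmetic.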
Convert Pandas dataframe values to percentage
Use
>>> df.iloc[:, 1:] = df.iloc[:, 1:].div(df['total'], axis=0).mul(100).round(2).astype(str).add(' %')
>>> df
ID col1 col2 col3 total
0 1 28.57 % 42.86 % 28.57 % 100.0 %
1 2 16.67 % 33.33 % 50.0 % 100.0 %
2 3 25.0 % 37.5 % 37.5 % 100.0 %
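A possible variation, sketched on hypothetical input values chosen to reproduce the output shown above: build the percentages in a separate numeric frame and only convert to strings for the final display copy, instead of writing strings back into the original columns.

```python
import pandas as pd

# Hypothetical input chosen so the result matches the output above.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col1": [2, 1, 2],
    "col2": [3, 2, 3],
    "col3": [2, 3, 3],
})
df["total"] = df[["col1", "col2", "col3"]].sum(axis=1)

value_cols = ["col1", "col2", "col3", "total"]

# Numeric percentages: each row divided by its own total.
pct = df[value_cols].div(df["total"], axis=0).mul(100).round(2)

# Display copy: ID unchanged, percentages rendered as strings.
out = df[["ID"]].join(pct.astype(str).add(" %"))

print(out)
```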
Pandas - Converting columns in percentage based on first columns value
Try the div(), mul() and astype() methods:
df[['x%','y%']]=df[['x','y']].div(df['total'],axis=0).mul(100).astype(int)
Output of df:
categorie total x y x% y%
0 a 100 10 100 10 100
1 b 1000 100 1000 10 100
2 c 500 5 500 1 100
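One thing to keep in mind with astype(int) is that it truncates toward zero rather than rounding. The sketch below uses hypothetical numbers (not the data from the question) to show the difference, and adds .round() before the cast as one way to get the nearest whole percentage.

```python
import pandas as pd

# Hypothetical data where the percentages are not whole numbers.
df = pd.DataFrame({
    "categorie": ["a", "b"],
    "total": [3, 7],
    "x": [2, 6],
    "y": [3, 1],
})

raw = df[["x", "y"]].div(df["total"], axis=0).mul(100)

truncated = raw.astype(int)         # 66.67 -> 66, 85.71 -> 85
rounded = raw.round().astype(int)   # 66.67 -> 67, 85.71 -> 86

df[["x%", "y%"]] = rounded
print(df)
```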
How to convert a column in a dataframe from decimals to percentages with python
Toy example:
from pandas import DataFrame

df = DataFrame({
    'No Show (%)': [5e-01, 4e-01]
})
df
Input
No Show (%)
0 0.5
1 0.4
Code
mergedgc.style.format({"No Show (%)": "{:.2%}"})
can be replaced by
df['No Show (%)'] = df['No Show (%)'].transform(lambda x: '{:,.2%}'.format(x))
Output
No Show (%)
0 50.00%
1 40.00%
Edit
Plot
df['No Show (%)'].replace('%', '', regex=True).astype(float).plot()
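For reference, a small round-trip sketch on the toy data: format the proportions as percentage strings, then strip the '%' sign to get numbers back (for example before plotting). The variable names labels and numeric are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({"No Show (%)": [0.5, 0.4]})

# Proportions -> percentage strings such as '50.00%'.
labels = df["No Show (%)"].map("{:.2%}".format)

# Percentage strings -> floats on a 0-100 scale.
numeric = labels.str.rstrip("%").astype(float)

print(labels.tolist())
print(numeric.tolist())
```

Note that the round trip returns values on a 0-100 scale, not the original 0-1 proportions.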
Pandas transform columns into percentage by group
You can use transform after the groupby and assign the results directly to the column 'record':
gender_mix['record'] = gender_mix\
.groupby(['user', 'generation'])['record']\
.transform(lambda x: round((x/sum(x)*100)).astype(int))
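A possible alternative, assuming a frame shaped like the one in the question (the rows and the gender column below are hypothetical): ask transform for the per-group sum and do the division outside the groupby, which avoids the Python-level lambda.

```python
import pandas as pd

# Hypothetical data; only the column names user, generation and record
# come from the answer above.
gender_mix = pd.DataFrame({
    "user": ["u1", "u1", "u2", "u2"],
    "generation": ["g1", "g1", "g1", "g1"],
    "gender": ["f", "m", "f", "m"],
    "record": [1, 3, 2, 2],
})

# transform('sum') returns the group total repeated on every row of the group.
group_totals = gender_mix.groupby(["user", "generation"])["record"].transform("sum")

# Written to a new column here so the original counts stay available.
gender_mix["record_pct"] = (
    (gender_mix["record"] / group_totals * 100).round().astype(int)
)

print(gender_mix)
```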
Python/Pandas convert pivot table into percentages based on row total
- You can use the .pipe method combined with .div to divide every column by the 'Total' column, row by row.
- You can then use .applymap to apply string formatting to each of your values so that they appear as percentages (note that they become strings and are no longer mathematically useful). In pandas 2.1 and later, .applymap is deprecated in favor of .map.
out = (
df.pivot_table(
index='Name', columns='scenario', values='y',
aggfunc=np.count_nonzero, margins=True,
margins_name='Total', fill_value=0
)
.pipe(lambda d: d.div(d['Total'], axis='index'))
.applymap('{:.0%}'.format)
)
Example:
df = pd.DataFrame({
'a': [1, 0, 0, 1, 5],
'b': [20, 20, 10, 50, 15],
'c': [50, 20, 50, 100, 20]
})
print(df)
a b c
0 1 20 50
1 0 20 20
2 0 10 50
3 1 50 100
4 5 15 20
out = (
df.pipe(lambda d: d.div(d['c'], axis='index'))
.applymap('{:.0%}'.format)
)
print(out)
a b c
0 2% 40% 100%
1 0% 100% 100%
2 0% 20% 100%
3 1% 50% 100%
4 25% 75% 100%
Add column for percentage of total to Pandas dataframe
Option 1
df['DAYSLATE_pct'] = df.DAYSLATE / df.DAYSLATE.sum()
Option 2
Use value_counts (with normalize=True) instead of groupby:
pre_df.DAYSLATE.value_counts(normalize=True)
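The two options answer slightly different questions, which the following sketch on hypothetical data tries to show: Option 1 gives each row's share of the summed values, while value_counts(normalize=True) gives the share of rows holding each distinct value.

```python
import pandas as pd

# Hypothetical data; only the DAYSLATE column name comes from the answer.
pre_df = pd.DataFrame({"DAYSLATE": [0, 0, 1, 3]})

# Option 1: each row's value as a fraction of the column total (0 + 0 + 1 + 3 = 4).
value_share = pre_df["DAYSLATE"] / pre_df["DAYSLATE"].sum()

# Option 2: fraction of rows that hold each distinct value.
count_share = pre_df["DAYSLATE"].value_counts(normalize=True)

print(value_share.tolist())
print(count_share)
```

Multiply either result by 100 if percentages on a 0-100 scale are needed.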
Calculating percentages for multiple columns
You can groupby 'Page Name' and 'candidato', then sum each of 'Total Interactions', 'Likes', 'Comments', 'Shares', 'Love' and 'Angry' for each page name and each candidate; call the result totals.
Then use groupby on totals by the first index level (which is 'Page Name') and transform with the sum function, so that every row of totals carries the sum for its page name. Dividing totals by that result gives the percentages.
Finally, join the two DataFrames for the final outcome.
totals = df.groupby(['Page Name','candidato'])[['Total Interactions','Likes','Comments','Shares','Love','Angry']].sum()
percentages = totals.groupby(level=0).transform('sum').rdiv(totals).mul(100).round(2)
out = totals.join(percentages, lsuffix='', rsuffix='_Percentages').reset_index()
This produces a DataFrame that can produce the plot in the question.
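To see what the three lines do, here is a reduced sketch with hypothetical data and only two of the metric columns. Within each page name, the percentage columns should add up to 100.

```python
import pandas as pd

# Hypothetical data with two metrics; the real frame has more columns.
df = pd.DataFrame({
    "Page Name": ["p1", "p1", "p2", "p2"],
    "candidato": ["a", "b", "a", "b"],
    "Likes": [1, 3, 2, 2],
    "Shares": [4, 4, 1, 3],
})

totals = df.groupby(["Page Name", "candidato"])[["Likes", "Shares"]].sum()

# Per-page sums broadcast back to every row; rdiv(totals) computes
# totals / page_sums.
page_sums = totals.groupby(level=0).transform("sum")
percentages = page_sums.rdiv(totals).mul(100).round(2)

out = totals.join(percentages, lsuffix="", rsuffix="_Percentages").reset_index()
print(out)
```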
Best Way to Calculate Column Percentage of Total Pandas
>>> print(df.drop('Total', axis=1).divide(df.Total, axis=0))
Group1 Group2
0 0.600 0.400
1 0.950 0.050
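A self-contained version of the same call, using hypothetical counts picked to reproduce the proportions shown above:

```python
import pandas as pd

# Hypothetical counts chosen to reproduce the proportions above.
df = pd.DataFrame({
    "Group1": [3, 19],
    "Group2": [2, 1],
    "Total": [5, 20],
})

# Drop the Total column, then divide each remaining row by its total.
shares = df.drop("Total", axis=1).divide(df["Total"], axis=0)

print(shares)
```

The result holds proportions between 0 and 1; chain .mul(100) if percentages are wanted.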