Calculate percentage of rows with a column above a certain value Pandas
You can compare values for greater like 30
with aggregate mean
:
df = (df.B > 30).groupby(df['A']).mean().mul(100).reset_index(name='C')
print (df)
A C
0 r 60.0
Or:
df = df.assign(C = df.B > 30).groupby('A')['C'].mean().mul(100).reset_index()
How to calculate the percentage of rows in pandas?
import pandas as pd
data = {'Conditions ': ['NormalBlink1400', 'NormalBlink2000', 'NormalBlink3000', 'NormalBlink4000','NormalNoBlink1400'],
'-1': [48, 74, 77, 67,40],
'1': [108, 124, 147, 150,119]}
df = pd.DataFrame(data)
df['perc_one '] = (df["1"] / (df["1"] + df["-1"]))
df["perc_minus_one"] = (df["-1"] / (df["1"] + df["-1"]))
Output:
df.head()
Compute row percentages in pandas DataFrame?
div + sum
For a vectorised solution, divide the dataframe along axis=0
by its sum over axis=1
. You can use set_index
+ reset_index
to ignore the identifier column.
df = df.set_index('cat')
res = df.div(df.sum(axis=1), axis=0)
print(res.reset_index())
cat val1 val2 val3 val4
0 A 0.194444 0.277778 0.000000 0.527778
1 B 0.370370 0.074074 0.037037 0.518519
2 C 0.119048 0.357143 0.142857 0.380952
Pandas - get first n-rows based on percentage
I want to pop first 5% of record
There is no built-in method but you can do this:
You can multiply
the total number of rows to your percent and use the result as parameter for head
method.
n = 5
df.head(int(len(df)*(n/100)))
So if your dataframe contains 1000
rows and n = 5%
you will get the first 50
rows.
Calculate percentage change between values of column in Pandas dataframe
pct_change
is computing a change relative to the previous value (which is why 2017 is NaN), and this doesn't seem to be what you want. If you want to compute a percentage change relative to 2019, as 2019 is already normalized to 100, simply subtract 100:
df['Percentage_Change'] = df['Index'].sub(100)
output:
Country Industry Year Index Percentage_Change
0 US Agriculture 2017 83.0 -17.0
1 US Agriculture 2018 97.2 -2.8
2 US Agriculture 2019 100.0 0.0
3 US Agriculture 2020 112.0 12.0
4 US Agriculture 2021 108.0 8.0
5 Japan Mining 2017 88.0 -12.0
6 Japan Mining 2018 93.0 -7.0
7 Japan Mining 2019 100.0 0.0
8 Japan Mining 2020 104.0 4.0
9 Japan Mining 2021 112.0 12.0
bidirectional pct_change
If you want a bidirectional pct_change
"centered" on 100, you can use masks to compute the pct_change
both ways:
df['Percentage_Change'] = (df
.assign(ref=df['Year'].eq(2019))
.groupby(['Country', 'Industry'], group_keys=False)
.apply(lambda g: g['Index'].where(g['ref'].cummax()).pct_change()
.fillna(g['Index'][::-1].pct_change().mask(g['ref'].cummax(), 0))
)
)
output:
Country Industry Year Index Percentage_Change
0 US Agriculture 2017 83.0 -0.146091
1 US Agriculture 2018 97.2 -0.028000
2 US Agriculture 2019 100.0 0.000000
3 US Agriculture 2020 112.0 0.120000
4 US Agriculture 2021 108.0 -0.035714
5 Japan Mining 2017 88.0 -0.053763
6 Japan Mining 2018 93.0 -0.070000
7 Japan Mining 2019 100.0 0.000000
8 Japan Mining 2020 104.0 0.040000
9 Japan Mining 2021 112.0 0.076923
Python Pandas – How to calculate percentage difference row wise for a entire dataframe with NaNs
Use pct_change
:
>>> df[['route']].join(df.filter(like='col').pct_change(axis=1).mul(100).round(2))
route col1 col2 col3 col4 col5
0 1 NaN NaN 0.00 0.00 0.00
1 2 NaN 433.33 0.00 145.83 0.00
2 3 NaN 2.94 68.57 -10.17 0.00
3 4 NaN 0.00 0.00 0.00 433.33
Related Topics
How to Print Colored Text to the Terminal
How to Print Just the First Letters of Each Word
Python: How to Calculate the Average Word Length in a Sentence Using the .Split Command
Get All Rows That Have Same Value in Pandas
Is There a Memory Efficient and Fast Way to Load Big Json Files
Collecting and Reporting Pytest Results
Write a Program That Find the Largest Integer in a String
Get Character Position in Alphabet
Pandas Extract Numbers from Column into New Columns
Python: Searching for Common Values in Two Files
How to Download Multiple Files or an Entire Folder from Google Colab
How to Change Default Python Version
How to Code My Bot to Generate Random Images from One Command
Python Check If Website Exists
List Append Is Overwriting My Previous Values
How to Convert a 16-Bit to an 8-Bit Image in Opencv
How to Execute Multiple Commands in a Single Session in Paramiko - Python