Pandas: Calculate Total Percent Difference Between Two Data Frames

How to calculate percentage difference between two data frames with Pandas?

You can simply divide df2 by df1 on the columns of interest:

df2.loc[:,"'abc'":] = df2.loc[:,"'abc'":].div(df1.loc[:,"'abc'":]).mul(100)

       ID  'abc'  'dfe'
0   Total   75.0   80.0
1    Slow    NaN    NaN
2  Normal    0.0   50.0
3    Fast  100.0  100.0
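For a self-contained version of the division above, here is a minimal sketch. The values in df1 and df2 are invented for illustration (the original frames are not shown), and the columns are named plainly abc and dfe rather than with the literal quotes used above. It builds a new frame instead of writing back into df2.

```python
import pandas as pd

# Hypothetical data, chosen only so the ratios are easy to check by hand.
df1 = pd.DataFrame({'ID': ['Total', 'Slow', 'Normal', 'Fast'],
                    'abc': [4, 0, 2, 2],
                    'dfe': [5, 0, 2, 3]})
df2 = pd.DataFrame({'ID': ['Total', 'Slow', 'Normal', 'Fast'],
                    'abc': [3, 0, 0, 2],
                    'dfe': [4, 0, 1, 3]})

cols = ['abc', 'dfe']

# df2 as a percentage of df1; rows align on the shared default index.
# 0 / 0 yields NaN, which is why a row of zeros shows up as NaN.
pct = df2[cols].div(df1[cols]).mul(100)

# Put the label column back next to the computed percentages.
result = pd.concat([df2[['ID']], pct], axis=1)
print(result)
```

Note that the division aligns on the index, so both frames need their rows in the same order (or a shared index such as ID) for the result to be meaningful.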

Update

In order to format as specified, you can do:

# Format only the percentage columns; NaN cells are left untouched.
pct = df2.loc[:, "'abc'":]
df2[pct.columns] = pct.where(pct.isna(), pct.round(2).astype(str).add('%'))

       ID   'abc'   'dfe'
0   Total   75.0%   80.0%
1    Slow     NaN     NaN
2  Normal    0.0%   50.0%
3    Fast  100.0%  100.0%

These values have no decimal places other than .0, so round(2) has no visible effect on the output above. As soon as the division produces a value with more decimal places, that value is shown rounded to two decimals.
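If every value should show exactly two decimals, one option is a format string instead of round followed by astype(str). This is a sketch on invented data, not part of the answer above; the frame pct and its values are hypothetical.

```python
import pandas as pd

# Hypothetical percentages, including a NaN and a value with many decimals.
pct = pd.DataFrame({'abc': [75.0, float('nan'), 100 / 3]})

# Series.map with na_action='ignore' skips NaN cells, so they stay NaN.
formatted = pct.apply(
    lambda col: col.map('{:.2f}%'.format, na_action='ignore')
)
print(formatted)
```

Here 75.0 becomes '75.00%' and 100 / 3 becomes '33.33%', while the missing value is left as NaN.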

Compute the percentage change over multiple pandas dataframes

Set the string column as the index, divide the DataFrames with DataFrame.div, subtract 1 with DataFrame.sub, and multiply by 100 with DataFrame.mul:

df = df2.set_index('summary').div(df1.set_index('summary')).sub(1).mul(100).reset_index()
print(df)
  summary  col1  col2  col3
0   count  50.0  50.0 -50.0
1    mean -50.0 -50.0 -50.0
2  stddev   0.0   0.0   0.0
3     min   NaN   0.0   0.0
4     max -50.0   0.0   0.0
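The div/sub/mul chain is the usual percent-change formula, (new / old - 1) * 100, which is algebraically the same as (new - old) / old * 100. A small sketch of both forms, using invented summary frames (the data is hypothetical and much smaller than the frames in the question):

```python
import pandas as pd

# Hypothetical summary frames, invented for illustration.
df1 = pd.DataFrame({'summary': ['count', 'mean'], 'col1': [2.0, 4.0]})
df2 = pd.DataFrame({'summary': ['count', 'mean'], 'col1': [3.0, 2.0]})

# Index by the label column so rows align by name, not by position.
old = df1.set_index('summary')
new = df2.set_index('summary')

ratio_form = new.div(old).sub(1).mul(100)    # (new / old - 1) * 100
diff_form = new.sub(old).div(old).mul(100)   # (new - old) / old * 100

print(ratio_form.reset_index())
```

Both forms give 50.0 for count and -50.0 for mean on this data.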

EDIT:

If you need pct_change between consecutive DataFrames in a list (df1 with df2, df2 with df3, ...):

L = [df1, df2]
df = (pd.concat(L, keys=range(len(L)))
        .set_index('summary', append=True)
        .groupby(level=1)
        .pct_change())

print(df)
             col1  col2  col3
    summary
0 0 count     NaN   NaN   NaN
  1 mean      NaN   NaN   NaN
  2 stddev    NaN   NaN   NaN
  3 min       NaN   NaN   NaN
  4 max       NaN   NaN   NaN
1 0 count     0.5   0.5  -0.5
  1 mean     -0.5  -0.5  -0.5
  2 stddev    0.0   0.0   0.0
  3 min       NaN   0.0   0.0
  4 max      -0.5   0.0   0.0
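As an alternative sketch (not the groupby approach used above), consecutive frames can also be paired with zip and divided directly, which avoids the all-NaN block for the first frame. The three frames and the names frames, changes and result are invented for illustration.

```python
import pandas as pd

# Hypothetical summary frames, invented for illustration.
df1 = pd.DataFrame({'summary': ['count', 'mean'], 'col1': [2.0, 4.0]})
df2 = pd.DataFrame({'summary': ['count', 'mean'], 'col1': [3.0, 2.0]})
df3 = pd.DataFrame({'summary': ['count', 'mean'], 'col1': [6.0, 3.0]})

# Index by the label column so each division aligns rows by name.
frames = [f.set_index('summary') for f in (df1, df2, df3)]

# Fractional change of each frame relative to the one before it.
changes = [curr.div(prev).sub(1) for prev, curr in zip(frames, frames[1:])]

# Key k holds the change from frames[k - 1] to frames[k].
result = pd.concat(changes, keys=range(1, len(frames)))
print(result)
```

Like pct_change, this returns fractions (0.5 means +50%); multiply by 100 if percentages are wanted.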

How to calculate percentage change between two years and insert in a new DataFrame in Pandas?

You can create a decade column, then use DataFrame.pivot_table with sum, followed by DataFrame.pct_change:

d = df['year'] // 10 * 10
df['dec'] = (d + 1).astype(str) + '-' + (d + 10).astype(str)

Another idea is to build the decade column with cut:

bins = range(df['year'].min(), df['year'].max() + 10, 10)
labels = [f'{i}-{j-1}' for i, j in zip(bins[:-1], bins[1:])]

df['dec'] = pd.cut(df.year, bins=bins, labels=labels, include_lowest=True)


Then, with either version of the dec column, pivot and compute the change:

df1 = (df.pivot_table(index='country',
                      columns='dec',
                      values='population',
                      aggfunc='sum')
         .pct_change(axis=1))
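To see the whole pipeline run end to end, here is a sketch on invented data. The countries, years and populations are hypothetical, and the decade labels here run from d to d + 9 (for example 1990-1999), which is an assumption of this sketch and differs from the d + 1 to d + 10 labels in the first snippet above.

```python
import pandas as pd

# Hypothetical population records, invented for illustration.
df = pd.DataFrame({
    'country': ['A', 'A', 'A', 'A', 'B', 'B'],
    'year': [1991, 1995, 2001, 2005, 1995, 2005],
    'population': [4, 6, 10, 5, 20, 10],
})

# Assumption of this sketch: decades are labelled d to d + 9.
d = df['year'] // 10 * 10
df['dec'] = d.astype(str) + '-' + (d + 9).astype(str)

# Sum population per country and decade, then compare each decade
# with the previous one along the columns.
out = (df.pivot_table(index='country', columns='dec',
                      values='population', aggfunc='sum')
         .pct_change(axis=1))
print(out)
```

The first decade column is all NaN because there is no earlier decade to compare against. The remaining values are fractions, so 0.5 means a 50% increase.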

How to calculate percent change compared to the beginning value using pandas?

Sounds like you are looking for an expanding_window version of pct_change(). This doesn't exist out of the box AFAIK, but you could roll your own:

df.groupby('security')['price'].apply(lambda x: x.div(x.iloc[0]).subtract(1).mul(100))
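A variant of the same idea uses transform instead of apply, so the result keeps the original row index and can be assigned straight back as a column. This is a sketch under assumptions: the data and the column name pct_from_start are invented, and rows are assumed to be already sorted in time order within each security.

```python
import pandas as pd

# Hypothetical prices, already in time order within each security.
df = pd.DataFrame({
    'security': ['X', 'X', 'X', 'Y', 'Y'],
    'price': [10.0, 11.0, 12.0, 50.0, 25.0],
})

# Percent change of every price relative to the first price of its group.
df['pct_from_start'] = (
    df.groupby('security')['price']
      .transform(lambda x: x.div(x.iloc[0]).sub(1).mul(100))
)
print(df)
```

The first row of each security is 0 by construction, and later rows show the cumulative change from that starting price.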

pandas df - calculate percentage difference not change

Take the difference between consecutive values with diff, make it absolute with abs, and divide by the rolling mean of each pair of values:

s = df['Radisson Collection'].rolling(2).mean()
df['new'] = df['Radisson Collection'].diff().abs().div(s) * 100
print(df)
                        Radisson Collection       new
Total awareness                    0.440553       NaN
Very/Somewhat familiar             0.462577  4.877260
Consideration                      0.494652  6.701636
Ever used                          0.484620  2.048869

If you need the values formatted as percentage strings:

# The outer parentheses let the method chain continue on the next line.
df['new'] = ((df['Radisson Collection'].diff().abs().div(s) * 100)
             .iloc[1:].round(5).astype(str) + '%')
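To make the distinction in the question title concrete, here is a small sketch of the two formulas on plain numbers. The function names are invented for illustration: percent change is measured against the starting value, while percent difference is measured against the mean of the two values, so it does not depend on their order.

```python
def percent_change(old, new):
    """Change relative to the starting value; the sign shows direction."""
    return (new - old) / old * 100


def percent_difference(a, b):
    """Absolute gap relative to the mean of the two values; symmetric."""
    return abs(a - b) / ((a + b) / 2) * 100


# Going from 40 to 50 and from 50 to 40 gives different percent changes...
print(percent_change(40, 50))
print(percent_change(50, 40))

# ...but the percent difference is the same in both directions.
print(percent_difference(40, 50))
print(percent_difference(50, 40))
```

The rolling(2).mean() in the pandas version above plays the role of (a + b) / 2 for each pair of consecutive rows.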

