Pandas: Starting from the Second Row. Subtract from Previous Row and Use It as Value to the Next Subtraction

Numpy, `cumsum` with alternating sign

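import numpy as np

i = np.arange(len(df))   # row positions
j = np.arange(2)         # the two possible sign phases

# Column 0 starts with +1, column 1 starts with -1; both alternate by row.
a = np.where(
    (i[:, None] + j) % 2 == 0, 1, -1
) * df.VALUE.values[:, None]

# Cumulative sum down the rows, then take column 0 on even rows
# and column 1 on odd rows.
b = a.cumsum(0)[i, i % 2]

df.assign(VALUE=b)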

   ID  VALUE
0   0      1
1   1      9
2   2     21
3   3     24
4   4     54

Explanation

The first thing to notice is how each row expands in terms of the original values:

X0 ->                     X0
X1 ->                X1 - X0
X2 ->           X2 - X1 + X0
X3 ->      X3 - X2 + X1 - X0
X4 -> X4 - X3 + X2 - X1 + X0

Each result is an alternating-sign sum, so I want to multiply every other row by -1 before summing. Which rows get the minus sign depends on whether the target row is even or odd, so I need both alternations.

I generate a mask that swaps between +1 and -1, with one column per alternation:

i = np.arange(len(df))
j = np.arange(2)

m = np.where(
    (i[:, None] + j) % 2 == 0, 1, -1
)

m

array([[ 1, -1],
       [-1,  1],
       [ 1, -1],
       [-1,  1],
       [ 1, -1]])

Now I broadcast-multiply this mask across df.VALUE:

a = m * df.VALUE.values[:, None]

a

array([[  1,  -1],
       [-10,  10],
       [ 30, -30],
       [-45,  45],
       [ 78, -78]])

Notice the pattern. Now I take the cumulative sum down the rows:

a.cumsum(0)

array([[  1,  -1],
       [ -9,   9],
       [ 21, -21],
       [-24,  24],
       [ 54, -54]])

I only need one entry per row: column 0 on the even rows and column 1 on the odd rows (the positive ones in this example). So I index the rows with i and the columns with i % 2:

b = a.cumsum(0)[i, i % 2]
b

array([ 1,  9, 21, 24, 54])
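To make that indexing step concrete, here is a small sketch that applies it to the cumulative-sum array shown above and compares it with an explicit element-by-element pick. The names `c`, `picked`, and `by_hand` are only illustrative.

```python
import numpy as np

# Cumulative-sum array copied from the output above.
c = np.array([[1, -1], [-9, 9], [21, -21], [-24, 24], [54, -54]])
i = np.arange(len(c))

# Paired index arrays select c[0, 0], c[1, 1], c[2, 0], c[3, 1], c[4, 0].
picked = c[i, i % 2]

# The same selection written out one element at a time.
by_hand = np.array([c[k, k % 2] for k in range(len(c))])

print(picked)   # [ 1  9 21 24 54]
```

Passing two integer arrays of equal length picks one element per (row, column) pair, which is why a single expression is enough here.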

This is what I ended up assigning to the existing column

df.assign(VALUE=b)

   ID  VALUE
0   0      1
1   1      9
2   2     21
3   3     24
4   4     54

assign returns a copy of df with the VALUE column replaced by b; df itself is left unchanged.

To keep the result, bind it to a new name (or back to df if you prefer):

df_new = df.assign(VALUE=b)
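As a cross-check, the recurrence can also be written as a plain loop and compared with the vectorized result. This is only a sketch: the input frame is not shown in the answer, so the VALUE column below is inferred from the intermediate array `a`, and `running_subtract` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# VALUE is inferred from the intermediate array `a` shown above (an assumption).
df = pd.DataFrame({"ID": range(5), "VALUE": [1, 10, 30, 45, 78]})

def running_subtract(values):
    """Reference loop: out[0] = x[0], out[k] = x[k] - out[k-1]."""
    out = []
    prev = 0
    for x in values:
        prev = x - prev
        out.append(prev)
    return out

expected = running_subtract(df["VALUE"].tolist())

# Vectorized version from the answer.
i = np.arange(len(df))
j = np.arange(2)
m = np.where((i[:, None] + j) % 2 == 0, 1, -1)
b = (m * df["VALUE"].to_numpy()[:, None]).cumsum(0)[i, i % 2]

print(expected)      # [1, 9, 21, 24, 54]
print(b.tolist())    # [1, 9, 21, 24, 54]
```

The loop is easier to read and is a convenient oracle for testing; the NumPy version avoids the Python-level iteration.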

pandas subtracting value in another column from previous row

Here is one potential way to do this.

First create a boolean mask, then use numpy.where and Series.shift to create the column date_difference:

mask = df.duplicated(['identifier', 'id_number'])

df['date_difference'] = (np.where(mask, (df['contract_year_month'] - 
                                         df['collection_year_month'].shift(1)).dt.days, np.nan))

[output]

  identifier  id_number contract_year_month collection_year_month  date_difference
0       K001          1          2018-01-03            2018-01-09              NaN
1       K001          1          2018-01-08            2018-01-10             -1.0
2       K001          2          2018-01-01            2018-01-05              NaN
3       K001          2          2018-01-15            2018-01-18             10.0
4       K002          4          2018-01-04            2018-01-07              NaN
5       K002          4          2018-01-09            2018-01-15              2.0
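For a version that runs on its own, the sketch below rebuilds the frame from the printed output and repeats the mask-and-shift approach. It also shows a per-group shift as a possible alternative; that variant is not part of the original answer, and `delta` and `alt` are illustrative names.

```python
import numpy as np
import pandas as pd

# Data rebuilt from the output table above.
df = pd.DataFrame({
    "identifier": ["K001", "K001", "K001", "K001", "K002", "K002"],
    "id_number": [1, 1, 2, 2, 4, 4],
    "contract_year_month": pd.to_datetime(
        ["2018-01-03", "2018-01-08", "2018-01-01",
         "2018-01-15", "2018-01-04", "2018-01-09"]),
    "collection_year_month": pd.to_datetime(
        ["2018-01-09", "2018-01-10", "2018-01-05",
         "2018-01-18", "2018-01-07", "2018-01-15"]),
})

# True for every row that repeats an earlier (identifier, id_number) pair.
mask = df.duplicated(["identifier", "id_number"])

# Current contract date minus the previous row's collection date.
delta = df["contract_year_month"] - df["collection_year_month"].shift(1)
df["date_difference"] = np.where(mask, delta.dt.days, np.nan)

# Alternative (assumption): shift within each group, so no mask is needed.
prev = df.groupby(["identifier", "id_number"])["collection_year_month"].shift()
alt = (df["contract_year_month"] - prev).dt.days

print(df["date_difference"].tolist())  # [nan, -1.0, nan, 10.0, nan, 2.0]
```

The mask version relies on the rows of each pair being adjacent and in order; the grouped shift does not depend on adjacency, only on the order within each group.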

Conditional shift: Subtract 'previous row value' from 'current row value' with multiple conditions in pandas

You may try something like this:

df['DiffHeartRate'] = (df.groupby(['Disease', 'State',
          (df.MonthStart.dt.month.ne(df.MonthStart.dt.month.shift() + 1)).cumsum()])['HeartRate']
 .diff()).fillna(df.HeartRate)

    Disease HeartRate   State   MonthStart  MonthEnd    DiffHeartRate
0   Covid   89          Texas   2020-02-28  2020-03-31  89.0
1   Covid   91          Texas   2020-03-31  2020-04-30  2.0
2   Covid   87          Texas   2020-07-31  2020-08-30  87.0
3   Cancer  90          Texas   2020-02-28  2020-03-31  90.0
4   Cancer  88          Florida 2020-03-31  2020-04-30  88.0
5   Covid   89          Florida 2020-02-28  2020-03-31  89.0
6   Covid   87          Florida 2020-03-31  2020-04-30  -2.0
7   Flu     90          Florida 2020-02-28  2020-03-31  90.0

The logic is the same as in the other answers, just expressed differently.
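The grouping key is easier to follow when it is pulled out into its own variable. The sketch below rebuilds the frame from the output above (MonthEnd is omitted because the calculation does not use it); `month` and `block` are illustrative names.

```python
import pandas as pd

# Data rebuilt from the output table above.
df = pd.DataFrame({
    "Disease": ["Covid", "Covid", "Covid", "Cancer",
                "Cancer", "Covid", "Covid", "Flu"],
    "HeartRate": [89, 91, 87, 90, 88, 89, 87, 90],
    "State": ["Texas", "Texas", "Texas", "Texas",
              "Florida", "Florida", "Florida", "Florida"],
    "MonthStart": pd.to_datetime([
        "2020-02-28", "2020-03-31", "2020-07-31", "2020-02-28",
        "2020-03-31", "2020-02-28", "2020-03-31", "2020-02-28"]),
})

month = df["MonthStart"].dt.month

# A new block starts whenever the month is not exactly previous month + 1.
block = month.ne(month.shift() + 1).cumsum()

# Difference within each (Disease, State, block); the first row of every
# group has no predecessor, so it falls back to its own HeartRate.
df["DiffHeartRate"] = (
    df.groupby(["Disease", "State", block])["HeartRate"]
      .diff()
      .fillna(df["HeartRate"])
)

print(df["DiffHeartRate"].tolist())
# [89.0, 2.0, 87.0, 90.0, 88.0, 89.0, -2.0, 90.0]
```

One caveat with the month + 1 test: a December row followed by a January row is not treated as consecutive, because 1 is not equal to 13. Data that spans a year boundary would need a different key.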

Subtract previous row value from the current row value in a Pandas column

Use pandas.Series.diff with fillna:

import pandas as pd

s = pd.Series([11,15,22,27,36,69,77])
s.diff().fillna(s)

Output:

0    11.0
1     4.0
2     7.0
3     5.0
4     9.0
5    33.0
6     8.0
dtype: float64

How do I subtract the previous row from the current row in a pandas dataframe and apply it to every row; without using a loop?

You can use the pct_change() and/or diff() methods.

Demo:

In [138]: df.Close.pct_change() * 100
Out[138]:
0         NaN
1    0.469484
2    0.467290
3   -0.930233
4    0.469484
5    0.467290
6    0.000000
7   -3.255814
8   -3.365385
9   -0.497512
Name: Close, dtype: float64

In [139]: df.Close.diff()
Out[139]:
0      NaN
1    0.125
2    0.125
3   -0.250
4    0.125
5    0.125
6    0.000
7   -0.875
8   -0.875
9   -0.125
Name: Close, dtype: float64
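The two methods are closely related, which a small sketch can show. The prices below are hypothetical and are not the data behind the output above; `manual` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical prices, chosen only to keep the arithmetic easy to follow.
close = pd.Series([100.0, 110.0, 99.0], name="Close")

diff = close.diff()               # absolute change from the previous row
pct = close.pct_change() * 100    # relative change, in percent

# pct_change is the row-to-row difference divided by the previous value.
manual = close.diff() / close.shift() * 100

print(diff.tolist())
print(pct.round(6).tolist())
```

In both results the first row is NaN because it has no previous row to compare against.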

How to subtract rows between two different dataframes and replace original value?

The first solution is to give df2 an index of bank names, so that it aligns with df1 and the subtraction hits the correct row:

df1.set_index('BankName').sub(df2.set_index([['Bank1']]), fill_value=0)

df1.set_index('BankName').sub(df2.set_index([['Bank2']]), fill_value=0)

Alternatively, add a BankName column to df2 and set BankName as the index in both DataFrames, so that the subtraction aligns on that row:

df22 = df2.assign(BankName = 'Bank1').set_index('BankName')
df = df1.set_index('BankName').sub(df22, fill_value=0).reset_index()
print (df)
  BankName  Value1  Value2
0    Bank1     7.0    53.0
1    Bank2    15.0    65.0
2    Bank3    14.0    54.0

Subtract by Bank2:

df22 = df2.assign(BankName = 'Bank2').set_index('BankName')
df = df1.set_index('BankName').sub(df22, fill_value=0).reset_index()
print (df)

  BankName  Value1  Value2
0    Bank1    10.0    55.0
1    Bank2    12.0    63.0
2    Bank3    14.0    54.0
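The input frames are not shown in this answer, so the sketch below infers them from the two printed results; treat the data as an assumption. `subtract_for` is a hypothetical helper that wraps the assign/set_index/sub steps for a given bank name.

```python
import pandas as pd

# Frames inferred from the printed results above (an assumption).
df1 = pd.DataFrame({
    "BankName": ["Bank1", "Bank2", "Bank3"],
    "Value1": [10, 15, 14],
    "Value2": [55, 65, 54],
})
df2 = pd.DataFrame({"Value1": [3], "Value2": [2]})

def subtract_for(bank):
    # Label df2's single row with the bank name so it aligns with df1's index.
    labelled = df2.assign(BankName=bank).set_index("BankName")
    # Rows missing from `labelled` are treated as 0, so they stay unchanged.
    return df1.set_index("BankName").sub(labelled, fill_value=0).reset_index()

print(subtract_for("Bank1"))
print(subtract_for("Bank2"))
```

Because fill_value=0 introduces missing values during alignment, the result columns come back as floats, which matches the 7.0 and 53.0 in the output above.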

Another solution filters df1 by BankName with a boolean mask:

m = df1['BankName']=='Bank1'
df1.loc[m, df2.columns] = df1.loc[m, df2.columns].sub(df2.iloc[0])
print (df1)
  BankName  Value1  Value2
0    Bank1       7      53
1    Bank2      15      65
2    Bank3      14      54

m = df1['BankName']=='Bank2'
df1.loc[m, df2.columns] = df1.loc[m, df2.columns].sub(df2.iloc[0])

Python Pandas Conditional Sum and subtract previous row

You can use .cumsum() to calculate a cumulative sum of the column:

import pandas as pd

df = pd.DataFrame({
    'column1': [50, 100, 30, 0, 30, 80, 0], 
    'column2': [0, 0, 0, 10, 0, 0, 30],
})

df['column3'] = df['column1'].cumsum() - df['column2'].cumsum()

This results in:

    column1 column2 column3
0    50     0        50
1   100     0       150
2    30     0       180
3     0    10       170
4    30     0       200
5    80     0       280
6     0    30       250
