Python Pandas - Get Row Based on Previous Row Value

Get previous row's value and calculate new column in pandas

The way to get the previous row's value is the shift method:

In [11]: df1.change.shift(1)
Out[11]:
0 NaT
1 2014-03-08
2 2014-04-08
3 2014-05-08
4 2014-06-08
Name: change, dtype: datetime64[ns]

Now you can subtract the original column from the shifted one. Note: this was run with pandas 0.13.1. Datetime handling had a lot of work around that release, so results may differ on older versions.

In [12]: df1.change.shift(1) - df1.change
Out[12]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 0 days
Name: change, dtype: timedelta64[ns]
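
For reference, here is a minimal self-contained sketch of the two steps above. The dates are an assumption reconstructed from the printed output, since the original df1 is not shown. Series.diff is included only as a cross-check: it computes current minus previous, which is the opposite sign of shift(1) minus the column.

```python
import pandas as pd

# Dates inferred from the output shown above; the real df1 is not given.
df1 = pd.DataFrame({
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    )
})

previous = df1["change"].shift(1)          # previous row's value, NaT in row 0
prev_minus_cur = previous - df1["change"]  # same expression as In [12]
cur_minus_prev = df1["change"].diff()      # current row minus previous row
```

If you want positive gaps, diff is usually the shorter spelling.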

You can just apply this to each case/group:

In [13]: df.groupby('case')['change'].apply(lambda x: x.shift(1) - x)
Out[13]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 NaT
dtype: timedelta64[ns]
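
As a possible alternative to apply with a lambda, the grouped shift can be called directly. This is only a sketch: the frame below is hypothetical (four rows for case 1 and one row for case 2, chosen to be consistent with the output above), and the column name delta is made up for illustration.

```python
import pandas as pd

# Hypothetical frame; the original df is not shown in the answer.
df = pd.DataFrame({
    "case": [1, 1, 1, 1, 2],
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    ),
})

# Shift within each case, so the first row of every case gets NaT
prev_change = df.groupby("case")["change"].shift(1)
df["delta"] = prev_change - df["change"]
```

Because the shift never crosses a group boundary, the single row of case 2 has no previous value and stays NaT.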

How to subtract previous row value from current row value based on condition in pandas DataFrame?

Update

Start a new group each time ID is 1 or 8, then apply diff within each group. The first row of each group (the rows where ID is 1 or 8) has no previous row and comes out as NaN, so fill those rows with the original values:

idx = pd.IndexSlice[:, 'Jan':'AnnualMean']
df.loc[idx] = df.loc[idx].groupby(df['ID'].isin([1, 8]).cumsum()) \
                         .diff().fillna(df).astype(int)

Output:

>>> df
ID Jan Feb Mrz Apr Mai Jun Jul Aug Sep Okt Nov Dez AnnualMean
0 1 14 18 17 45 22 31 30 4 22 26 12 48 24
1 2 -6 17 4 -14 28 18 -10 25 -5 23 5 -45 3
2 3 11 -31 -13 -4 -21 -12 5 -22 -15 -47 32 25 -7
3 4 -16 46 41 -7 -12 -8 10 32 6 40 -8 6 11
4 5 30 -48 -37 -5 32 20 11 -14 31 -31 1 4 -1
5 6 -21 17 2 23 -41 -7 -41 9 -3 18 -30 12 -5
6 7 4 29 15 -24 33 -36 4 -31 -32 4 0 -46 -7
7 8 25 24 4 26 7 45 17 2 47 17 19 3 20
8 9 22 12 30 -2 10 0 -14 30 -20 -2 27 46 11
9 10 3 -21 8 21 -4 -36 28 -22 22 -14 -16 -12 -3
10 1 22 26 32 50 22 30 48 27 19 27 44 19 31
11 2 5 19 11 -43 26 -17 -5 -26 26 -19 -33 -15 -6
12 3 -3 -41 -31 -2 -38 36 -19 15 -35 34 35 21 -3
13 4 21 28 9 0 20 -44 3 7 -6 -34 -25 -2 -2
14 5 -7 -4 -17 3 -26 15 9 -10 7 6 -10 -12 -3
15 6 4 18 24 34 42 23 -29 -5 29 16 22 -10 14
16 7 0 -35 9 -9 -30 -16 2 15 2 10 -4 34 -2
17 8 40 27 45 24 28 34 4 10 28 16 41 27 27
18 9 -36 -23 -44 -18 -20 0 39 38 -18 -6 -4 2 -7
19 10 35 13 17 17 19 -2 -29 -33 -2 35 -9 11 6
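
To make the grouping step easier to follow, here is a reduced sketch on a small hypothetical frame with only two month columns. The values and the names block, cols and result are assumptions for illustration; the technique (cumulative sum of the isin mask as the group key, then diff, then fillna with the original) is the same as above.

```python
import pandas as pd

# Hypothetical data: a new block starts at every ID 1 or 8
df = pd.DataFrame({
    "ID":  [1, 2, 3, 8, 9, 1, 2],
    "Jan": [14, 8, 19, 25, 47, 22, 27],
    "Feb": [18, 35, 4, 24, 36, 26, 45],
})

# The mask is True on block starts, so its running sum labels each block
block = df["ID"].isin([1, 8]).cumsum()

cols = ["Jan", "Feb"]
result = df.copy()
# diff inside each block; the NaN first row of a block keeps its original value
result[cols] = df[cols].groupby(block).diff().fillna(df[cols]).astype(int)
```

Here block evaluates to 1, 1, 1, 2, 2, 3, 3, so rows 0, 3 and 5 keep their original values and every other row holds the difference to the row before it.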

Old answer

Use where:

>>> df.set_index('ID').where(~df['ID'].between(1, 8), other=df.set_index('ID').diff()).reset_index().fillna(df)

ID Jan Feb Mrz Apr Mai Jun Jul Aug Sep Okt Nov Dez AnnualMean
0 1 14.0 18.0 17.0 45.0 22.0 31.0 30.0 4.0 22.0 26.0 12.0 48.0 24.0
1 2 -6.0 17.0 4.0 -14.0 28.0 18.0 -10.0 25.0 -5.0 23.0 5.0 -45.0 3.0
2 3 11.0 -31.0 -13.0 -4.0 -21.0 -12.0 5.0 -22.0 -15.0 -47.0 32.0 25.0 -7.0
3 4 -16.0 46.0 41.0 -7.0 -12.0 -8.0 10.0 32.0 6.0 40.0 -8.0 6.0 11.0
4 5 30.0 -48.0 -37.0 -5.0 32.0 20.0 11.0 -14.0 31.0 -31.0 1.0 4.0 -1.0
5 6 -21.0 17.0 2.0 23.0 -41.0 -7.0 -41.0 9.0 -3.0 18.0 -30.0 12.0 -5.0
6 7 4.0 29.0 15.0 -24.0 33.0 -36.0 4.0 -31.0 -32.0 4.0 0.0 -46.0 -7.0
7 8 25.0 24.0 4.0 26.0 7.0 45.0 17.0 2.0 47.0 17.0 19.0 3.0 20.0
8 9 47.0 36.0 34.0 24.0 17.0 45.0 3.0 32.0 27.0 15.0 46.0 49.0 31.0
9 10 3.0 -21.0 8.0 21.0 -4.0 -36.0 28.0 -22.0 22.0 -14.0 -16.0 -12.0 -3.0
10 1 -28.0 11.0 -10.0 5.0 9.0 21.0 17.0 17.0 -30.0 26.0 14.0 -18.0 3.0
11 2 5.0 19.0 11.0 -43.0 26.0 -17.0 -5.0 -26.0 26.0 -19.0 -33.0 -15.0 -6.0
12 3 -3.0 -41.0 -31.0 -2.0 -38.0 36.0 -19.0 15.0 -35.0 34.0 35.0 21.0 -3.0
13 4 21.0 28.0 9.0 0.0 20.0 -44.0 3.0 7.0 -6.0 -34.0 -25.0 -2.0 -2.0
14 5 -7.0 -4.0 -17.0 3.0 -26.0 15.0 9.0 -10.0 7.0 6.0 -10.0 -12.0 -3.0
15 6 4.0 18.0 24.0 34.0 42.0 23.0 -29.0 -5.0 29.0 16.0 22.0 -10.0 14.0
16 7 0.0 -35.0 9.0 -9.0 -30.0 -16.0 2.0 15.0 2.0 10.0 -4.0 34.0 -2.0
17 8 40.0 27.0 45.0 24.0 28.0 34.0 4.0 10.0 28.0 16.0 41.0 27.0 27.0
18 9 4.0 4.0 1.0 6.0 8.0 34.0 43.0 48.0 10.0 10.0 37.0 29.0 20.0
19 10 35.0 13.0 17.0 17.0 19.0 -2.0 -29.0 -33.0 -2.0 35.0 -9.0 11.0 6.0

Update

According to your comment:

idx = pd.IndexSlice[:, 'Jan':'AnnualMean']
df.loc[idx] = df.loc[idx].where(~df['ID'].between(1, 8), other=df.loc[idx].diff()).fillna(df)

Fill row value in column A with previous row value in column B in pandas

import numpy as np
import pandas as pd


def handle_group_reg(group):
    # remember which cells were originally null, so that a fill can be
    # reverted if it makes schedule_from and schedule_to identical
    cond_is_from_null = group['schedule_from'].isnull()
    cond_is_to_null = group['schedule_to'].isnull()

    # fill schedule_from with the previous row's schedule_to
    group['schedule_from'] = group['schedule_from'].combine_first(group['schedule_to'].shift(1))

    # fill schedule_to with the next row's schedule_from
    group['schedule_to'] = group['schedule_to'].combine_first(group['schedule_from'].shift(-1))

    # undo any fill that made schedule_from and schedule_to the same
    cond_is_from_to_same = group['schedule_from'] == group['schedule_to']
    group.loc[(cond_is_from_null & cond_is_from_to_same), 'schedule_from'] = np.nan
    group.loc[(cond_is_to_null & cond_is_from_to_same), 'schedule_to'] = np.nan

    return group


nan = np.nan
df = pd.DataFrame([{'reg': 'X-346', 'schedule_from': 'CAN', 'schedule_to': 'SHE'},
                   {'reg': 'X-346', 'schedule_from': nan, 'schedule_to': 'ZUH'},
                   {'reg': 'X-346', 'schedule_from': nan, 'schedule_to': 'SHA'},
                   {'reg': 'X-346', 'schedule_from': 'SHA', 'schedule_to': 'PEK'},
                   {'reg': 'X-346', 'schedule_from': 'PEK', 'schedule_to': nan},
                   {'reg': 'X-346', 'schedule_from': 'XMN', 'schedule_to': 'SHA'},
                   {'reg': 'A-583', 'schedule_from': 'CTU', 'schedule_to': nan},
                   {'reg': 'A-583', 'schedule_from': 'XMN', 'schedule_to': 'SZX'},
                   {'reg': 'T-777', 'schedule_from': 'SHA', 'schedule_to': nan},
                   {'reg': 'T-777', 'schedule_from': 'SHA', 'schedule_to': 'CVG'}])
dfn = df.groupby('reg').apply(handle_group_reg)
print(dfn.fillna(0))

# reg schedule_from schedule_to
# 0 X-346 CAN SHE
# 1 X-346 SHE ZUH
# 2 X-346 ZUH SHA
# 3 X-346 SHA PEK
# 4 X-346 PEK XMN
# 5 X-346 XMN SHA
# 6 A-583 CTU XMN
# 7 A-583 XMN SZX
# 8 T-777 SHA 0
# 9 T-777 SHA CVG
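
If the guard for identical from/to values is not needed, the two fills can probably be written without a custom function by shifting inside each group. This is a simplified sketch on a smaller hypothetical frame, not a drop-in replacement for handle_group_reg, because it skips the "same value" check.

```python
import numpy as np
import pandas as pd

# Hypothetical, reduced version of the schedule data
df = pd.DataFrame({
    "reg": ["X-346", "X-346", "X-346", "A-583", "A-583"],
    "schedule_from": ["CAN", np.nan, np.nan, "CTU", "XMN"],
    "schedule_to": ["SHE", "ZUH", "SHA", np.nan, "SZX"],
})

# Previous row's schedule_to within the same reg fills missing schedule_from
prev_to = df.groupby("reg")["schedule_to"].shift(1)
df["schedule_from"] = df["schedule_from"].combine_first(prev_to)

# Next row's (already filled) schedule_from fills missing schedule_to
next_from = df.groupby("reg")["schedule_from"].shift(-1)
df["schedule_to"] = df["schedule_to"].combine_first(next_from)
```

combine_first keeps every existing value and only takes the shifted value where the original is null.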

Pandas extract previous row on value change

Try shift(-1): compare each row with the next one and keep the rows where the value is about to change:

df[df.part_no_2 != df.part_no_2.shift(-1)]

part_no_2 qty
2 22 4
4 23 22
5 24 0
8 25 5
9 26 6
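
A runnable sketch of the same comparison follows. The input frame is hypothetical (the question's data is not shown here); it was chosen so that the selected rows match the result above. The mirror-image comparison with shift(1), which keeps the first row of each run instead, is added for contrast.

```python
import pandas as pd

# Hypothetical data, chosen so the result matches the rows shown above
df = pd.DataFrame({
    "part_no_2": [22, 22, 22, 23, 23, 24, 25, 25, 25, 26],
    "qty": [1, 2, 4, 3, 22, 0, 1, 2, 5, 6],
})

# True where the next row holds a different value (or there is no next row)
is_last_of_run = df["part_no_2"] != df["part_no_2"].shift(-1)
last_rows = df[is_last_of_run]

# Mirror image: first row of each run, compared with the previous row
first_rows = df[df["part_no_2"] != df["part_no_2"].shift(1)]
```

The final row is always kept by the shift(-1) version, because its shifted neighbour is NaN and NaN never compares equal.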

Python: How to iterate over rows and calculate value based on previous row

You could do it like this:

import pandas as pd

test = pd.DataFrame({'Country': ['USA', 'USA', 'USA', 'USA', 'USA'],
                     'Month': [6, 7, 8, 9, 10],
                     'Sales': [100, 200, 0, 0, 0],
                     'Recovery': [0, 1, 1.5, 2.5, 3]})

# start from Sales; cast to float because the loop writes fractional values
test['Prediction'] = test['Sales'].astype(float)
for i in range(1, len(test)):
    # prevent division by zero
    if test.loc[i-1, 'Recovery'] != 0:
        test.loc[i, 'Prediction'] = test.loc[i-1, 'Prediction'] * test.loc[i, 'Recovery'] / test.loc[i-1, 'Recovery']
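
For larger frames, the loop can possibly be replaced by a vectorized form. The sketch below assumes the same recurrence as the loop: a row keeps its own Sales value when there is no previous row or the previous Recovery is 0, and otherwise scales the previous Prediction by the ratio of Recovery values. The helper names prev_recovery, restart, ratio and base are made up for illustration.

```python
import pandas as pd

test = pd.DataFrame({
    "Sales": [100, 200, 0, 0, 0],
    "Recovery": [0, 1, 1.5, 2.5, 3],
})

prev_recovery = test["Recovery"].shift(1)
# A row restarts the chain when it has no previous row or the previous Recovery is 0
restart = prev_recovery.isna() | (prev_recovery == 0)
# Growth factor relative to the previous row; 1.0 on restart rows
ratio = (test["Recovery"] / prev_recovery).where(~restart, 1.0)
# Sales value at the start of each chain, carried forward
base = test["Sales"].where(restart).ffill()
# Multiply the factors up within each chain and scale the starting value
test["Prediction"] = base * ratio.groupby(restart.cumsum()).cumprod()
```

On this sample the result agrees with the loop up to floating-point rounding; it is worth checking both against each other on real data before relying on the vectorized version.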

