Get Previous Row's Value and Calculate New Column Pandas Python

Use the shift method to get the previous row's value:

In [11]: df1.change.shift(1)
Out[11]:
0 NaT
1 2014-03-08
2 2014-04-08
3 2014-05-08
4 2014-06-08
Name: change, dtype: datetime64[ns]

Now you can subtract the two columns. Note: this was run with pandas 0.13.1. Datetime handling changed a lot around that release, so results may differ on older versions.

In [12]: df1.change.shift(1) - df1.change
Out[12]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 0 days
Name: change, dtype: timedelta64[ns]
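
For anyone who wants to reproduce the two steps above, here is a minimal sketch. The dates are made up to match the printed output, and the variable names `previous`, `delta` and `elapsed` are just for illustration. It also shows `diff()`, which gives current minus previous, so the same values with the opposite sign.

```python
import pandas as pd

# Hypothetical data chosen to mirror the output shown above.
df1 = pd.DataFrame({
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    )
})

# shift(1) moves every value down one row, so row i holds the value of row i-1.
previous = df1["change"].shift(1)

# Previous minus current, as in the answer above.
delta = previous - df1["change"]

# diff() is current minus previous: the same magnitudes with the sign flipped.
elapsed = df1["change"].diff()

print(delta)
print(elapsed)
```

If you want positive day counts for increasing dates, `diff()` is probably the more convenient of the two.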

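You can apply the same calculation within each case/group, so the first row of a case is never compared with the last row of the previous case: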

In [13]: df.groupby('case')['change'].apply(lambda x: x.shift(1) - x)
Out[13]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 NaT
dtype: timedelta64[ns]
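
As an alternative sketch, `groupby(...).shift()` does the per-group shift directly and returns a result aligned with the original index, which makes it easy to assign back as a column. The frame below is hypothetical (four rows in case 1, one row in case 2) and the `delta` column name is my own. In recent pandas versions the `apply` form above may come back indexed by group key as well, so this variant can be simpler to work with.

```python
import pandas as pd

# Hypothetical two-case frame; 'case' and 'change' follow the answer above.
df = pd.DataFrame({
    "case": [1, 1, 1, 1, 2],
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-06-08"]
    ),
})

# Shift within each case: the first row of every case gets NaT.
prev_in_case = df.groupby("case")["change"].shift(1)

# Previous minus current, computed per case.
df["delta"] = prev_in_case - df["change"]
print(df)
```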

Is there a way in Pandas to use previous row value in dataframe.apply when previous value is also calculated in the apply?

First, seed the first row of the derived column C with the value from D:

df.loc[0, 'C'] = df.loc[0, 'D']

Then iterate through the remaining rows and fill the calculated values:

for i in range(1, len(df)):
    df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']

   Index_Date   A   B    C    D
0  2015-01-31  10  10   10   10
1  2015-02-01   2   3   23   22
2  2015-02-02  10  60  290  280
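
Repeated `df.loc` writes can get slow on large frames. A sketch of the same recurrence that builds the values in a plain list and assigns the column once is below. It reuses the A, B and D values from the table above; the names `a`, `b` and `c` are illustrative.

```python
import pandas as pd

# Sample values taken from the table above.
df = pd.DataFrame({"A": [10, 2, 10], "B": [10, 3, 60], "D": [10, 22, 280]})

a = df["A"].to_numpy()
b = df["B"].to_numpy()

# C[0] = D[0]; C[i] = C[i-1] * A[i] + B[i] for i >= 1.
c = [df["D"].iloc[0]]
for i in range(1, len(df)):
    c.append(c[-1] * a[i] + b[i])

# Assign the whole column in one step.
df["C"] = c
print(df)
```

Each row still depends on the previously computed row, so the loop itself cannot be replaced by a single shift.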

How to create a new column based on row value in previous row in Pandas dataframe?

First sort by the date column with sort_values so the rows are in chronological order, then use np.where with shift() to compare each Adj Close value with the one in the previous row:

# Sort by date (sort_values returns a new DataFrame, so assign it back)
df = df.sort_values(by='Date', ascending=True)

# Create a column comparing previous Adj Close with current Adj Close
import numpy as np
df['i'] = np.where(df['Adj Close'].shift(1) < df['Adj Close'],1,0)


df
             Date        Open        High  ...  Adj Close    Volume  i
index                                      ...
1297   2021-03-01  104.540001  133.990005  ...     120.40  49597300  0
1298   2021-03-02  116.930000  133.199900  ...     118.18  33640400  0
1299   2021-03-03  122.500000  127.700000  ...     124.18  19173700  1
Add previous row value to current row and so on Python

Use cumsum to keep a running total down the column:

df['tvd'] = df['tvd'].cumsum()

Example:

import pandas as pd
import numpy as np
from io import StringIO

txt = """      MD  Incl.    Azi.
0    0.00   0.00  350.00
1  161.00   0.00  350.00
2  261.00   0.00  350.00
3  361.00   0.00  350.00
4  461.00   0.00  350.00"""

# Split on runs of whitespace; the unlabeled first column becomes the index
df = pd.read_csv(StringIO(txt), sep=r'\s+')

# Vectorized calculation: shift() supplies the previous row, so no loop is needed
incl = np.deg2rad(df['Incl.'])
df['TVD_diff'] = ((df['MD'] - df['MD'].shift()) / 2) * (np.cos(incl).shift() + np.cos(incl))

df['TVD_diff'] = df['TVD_diff'].cumsum()

print(df)

Output:

      MD  Incl.   Azi.  TVD_diff
0    0.0    0.0  350.0       NaN
1  161.0    0.0  350.0     161.0
2  261.0    0.0  350.0     261.0
3  361.0    0.0  350.0     361.0
4  461.0    0.0  350.0     461.0
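
The NaN in the first row is there because row 0 has no previous row to shift from, and cumsum leaves that NaN in place. If you would rather have the running total start at 0, one option is to fill the first step before accumulating. This is a sketch that assumes a starting depth of 0 is what you want; the `step` variable and `TVD` column name are illustrative.

```python
import numpy as np
import pandas as pd

# Same MD and Incl. values as the example above.
df = pd.DataFrame({
    "MD": [0.0, 161.0, 261.0, 361.0, 461.0],
    "Incl.": [0.0] * 5,
})

incl = np.deg2rad(df["Incl."])
step = ((df["MD"] - df["MD"].shift()) / 2) * (np.cos(incl).shift() + np.cos(incl))

# Treat the missing first step as 0 (an assumption), then accumulate.
df["TVD"] = step.fillna(0).cumsum()
print(df)
```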

How can I create a new dataframe with cell values based on the previous row for each column?

You can calculate the cumulative product of the rows after the first using .cumprod(). Here I take the second row onwards, add 1 to these and calculate the cumulative product. I then multiply this by the first row.

(df.iloc[1:]+1).cumprod() * df.iloc[0]

And then concatenate the first row of your dataframe df.head(1) with the calculated dataframe using pd.concat():

pd.concat([df.head(1), ((df.iloc[1:]+1).cumprod() * df.iloc[0])], ignore_index=True)

This can be split into two steps:

# calculation
df2 = (df.iloc[1:]+1).cumprod() * df.iloc[0]
# concatenate the first row of df with the calculation
pd.concat([df.head(1), df2], ignore_index=True)
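
To make the arithmetic concrete, here is a sketch with a made-up frame where row 0 holds starting values and the later rows hold rates. The column names `x` and `y` and the numbers are assumptions, not the asker's data.

```python
import pandas as pd

# Hypothetical frame: row 0 holds starting values, later rows hold growth rates.
df = pd.DataFrame({"x": [100.0, 0.5, 1.0], "y": [10.0, 1.0, 2.0]})

# (1 + rate) compounded down the rows, then scaled by the starting values.
# df.iloc[0] is a Series indexed by column name, so it lines up column by column.
growth = (df.iloc[1:] + 1).cumprod() * df.iloc[0]

# Put the untouched first row back on top.
result = pd.concat([df.head(1), growth], ignore_index=True)
print(result)
```

Each row of the result equals the previous result row times (1 + the current row's rate), which is the "based on the previous row" behaviour without an explicit loop.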

Pandas - Create column based on sum of previous row values

Are you sure the expected output is correct?

I would use a rolling sum within each id, over a window of the current row plus the two before it:

df['sum'] = df.groupby('id')['val'].rolling(min_periods=1, window=3).sum().values

output:

    id  val   sum
0    5    1   1.0
1    5    0   1.0
2    5    4   5.0
3    5    6  10.0
4    5    2  12.0
5    5    3  11.0
6    9    0   0.0
7    9    1   1.0
8    9    6   7.0
9    9    2   9.0
10   9    4  12.0
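
One caveat: assigning `.values` relies on the rows of each id being stored together and in group order. A sketch that avoids that assumption uses `transform`, which returns a result aligned with the original index. The data is copied from the output above.

```python
import pandas as pd

# Values copied from the output above.
df = pd.DataFrame({
    "id":  [5, 5, 5, 5, 5, 5, 9, 9, 9, 9, 9],
    "val": [1, 0, 4, 6, 2, 3, 0, 1, 6, 2, 4],
})

# transform keeps the original row index, so the result lines up with df
# even if rows of different ids are interleaved.
df["sum"] = df.groupby("id")["val"].transform(
    lambda s: s.rolling(window=3, min_periods=1).sum()
)
print(df)
```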

