Get previous row's value and calculate new column in pandas (Python)
The way to get the previous row's value is to use the shift method:
In [11]: df1.change.shift(1)
Out[11]:
0 NaT
1 2014-03-08
2 2014-04-08
3 2014-05-08
4 2014-06-08
Name: change, dtype: datetime64[ns]
Now you can subtract these columns. Note: this is with pandas 0.13.1 (datetime handling has had a lot of work recently, so YMMV with older versions).
In [12]: df1.change.shift(1) - df1.change
Out[12]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 0 days
Name: change, dtype: timedelta64[ns]
You can just apply this to each case/group:
In [13]: df.groupby('case')['change'].apply(lambda x: x.shift(1) - x)
Out[13]:
0 NaT
1 -31 days
2 -30 days
3 -31 days
4 NaT
dtype: timedelta64[ns]
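As a self-contained sketch of the per-group version, the snippet below builds a small hypothetical frame (the `case` and `change` values are made up for illustration) and uses `groupby(...).shift()`, which shifts within each group without going through `apply`:

```python
import pandas as pd

# Hypothetical data: two cases, each with its own sequence of change dates
df = pd.DataFrame({
    "case": [1, 1, 1, 2, 2],
    "change": pd.to_datetime(
        ["2014-03-08", "2014-04-08", "2014-05-08", "2014-06-08", "2014-07-08"]
    ),
})

# Previous row's value within each case; the first row of every case gets NaT
prev_change = df.groupby("case")["change"].shift(1)

# Same sign convention as above: previous minus current
df["delta"] = prev_change - df["change"]
print(df)
```

The first row of each case has no predecessor, so its `delta` is NaT rather than a difference carried over from the previous group.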
Is there a way in Pandas to use previous row value in dataframe.apply when previous value is also calculated in the apply?
First, set the initial value of C from the first row of D:
df.loc[0, 'C'] = df.loc[0, 'D']
Then iterate through the remaining rows and fill the calculated values:
for i in range(1, len(df)):
    df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']
Index_Date A B C D
0 2015-01-31 10 10 10 10
1 2015-02-01 2 3 23 22
2 2015-02-02 10 60 290 280
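For a version that runs on its own, here is a minimal sketch that rebuilds the frame from the table above and accumulates C in a plain Python list before assigning it once. Building the list first is my own variation, not part of the original answer; it avoids repeated `.loc` writes but computes the same recurrence.

```python
import pandas as pd

# Values for A, B and D taken from the table above
df = pd.DataFrame({
    "Index_Date": pd.to_datetime(["2015-01-31", "2015-02-01", "2015-02-02"]),
    "A": [10, 2, 10],
    "B": [10, 3, 60],
    "D": [10, 22, 280],
})

# C[0] = D[0]; C[i] = C[i-1] * A[i] + B[i] for the remaining rows
c = [df.loc[0, "D"]]
for a, b in zip(df["A"].iloc[1:], df["B"].iloc[1:]):
    c.append(c[-1] * a + b)

df["C"] = c
print(df)
```

Because each value of C depends on the previously computed C, this recurrence cannot be expressed with a single `shift`; some form of sequential accumulation is needed.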
How to create a new column based on row value in previous row in Pandas dataframe?
You can first use sort_values on your date column to make sure the rows are in the right order for the comparison, and then use np.where with shift() to compare the previous value in Adj Close with the current one:
# Sort by date
df = df.sort_values(by='Date', ascending=True)
# Create a column comparing previous Adj Close with current Adj Close
import numpy as np
df['i'] = np.where(df['Adj Close'].shift(1) < df['Adj Close'],1,0)
df
Date Open High ... Adj Close Volume i
index ...
1297 2021-03-01 104.540001 133.990005 ... 120.40 49597300 0
1298 2021-03-02 116.930000 133.199900 ... 118.18 33640400 0
1299 2021-03-03 122.500000 127.700000 ... 124.18 19173700 1
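The following sketch reproduces the idea on a tiny frame. The three Adj Close values are copied from the output above; the remaining columns are omitted and the rows are deliberately given out of order to show why the sort matters.

```python
import numpy as np
import pandas as pd

# Adj Close values copied from the output above; other columns omitted
df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-03-02", "2021-03-01", "2021-03-03"]),
    "Adj Close": [118.18, 120.40, 124.18],
})

# sort_values returns a new frame, so assign the result back
df = df.sort_values(by="Date", ascending=True).reset_index(drop=True)

# 1 where the close rose versus the previous row, else 0.
# The first row compares against NaN, which evaluates to False, so it gets 0.
df["i"] = np.where(df["Adj Close"].shift(1) < df["Adj Close"], 1, 0)
print(df)
```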
Add previous row value to current row and so on Python
Use cumsum
df['tvd'] = df['tvd'].cumsum()
Example:
import pandas as pd
import numpy as np
from io import StringIO
txt = """ MD Incl. Azi.
0 0.00 0.00 350.00
1 161.00 0.00 350.00
2 261.00 0.00 350.00
3 361.00 0.00 350.00
4 461.00 0.00 350.00"""
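# Raw string avoids an invalid-escape warning; \s+ splits on any run of whitespace
df = pd.read_csv(StringIO(txt), sep=r'\s+')
# These operations are vectorized, so no explicit loop over the rows is needed
incl = np.deg2rad(df['Incl.'])
df['TVD_diff'] = ((df['MD'] - df['MD'].shift()) / 2) * (np.cos(incl).shift() + np.cos(incl))
df['TVD_diff'] = df['TVD_diff'].cumsum()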
print(df)
Output:
MD Incl. Azi. TVD_diff
0 0.0 0.0 350.0 NaN
1 161.0 0.0 350.0 161.0
2 261.0 0.0 350.0 261.0
3 361.0 0.0 350.0 361.0
4 461.0 0.0 350.0 461.0
How can I create a new dataframe with cell values based on the previous row for each column?
You can calculate the cumulative product of the rows after the first using .cumprod(). Here I take the second row onwards, add 1 to each value, and calculate the cumulative product. I then multiply this by the first row.
(df.iloc[1:]+1).cumprod() * df.iloc[0]
Then concatenate the first row of your dataframe, df.head(1), with the calculated dataframe using pd.concat():
pd.concat([df.head(1), ((df.iloc[1:]+1).cumprod() * df.iloc[0])], ignore_index=True)
This can be split into parts:
# calculation
df2 = (df.iloc[1:]+1).cumprod() * df.iloc[0]
# concatenate the first row of df with the calculation
pd.concat([df.head(1), df2], ignore_index=True)
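To make the steps concrete, here is a small sketch on hypothetical data (the column names `x` and `y` and all values are invented for illustration): row 0 holds starting values and the later rows hold growth rates.

```python
import pandas as pd

# Hypothetical data: row 0 holds starting values, later rows hold growth rates
df = pd.DataFrame({"x": [100.0, 0.5, 1.0], "y": [10.0, 1.0, 0.0]})

# Running product of (1 + rate) for every row after the first
growth = (df.iloc[1:] + 1).cumprod()

# Scale each column by its starting value; the Series aligns on column names
df2 = growth * df.iloc[0]

# Put the untouched first row back on top
result = pd.concat([df.head(1), df2], ignore_index=True)
print(result)
```

Each row of `result` after the first equals the previous row multiplied by (1 + rate), which is what the cumulative product achieves without a loop.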
Pandas: create column based on sum of previous row values
Are you sure the expected output is correct?
I would do:
df['sum'] = df.groupby('id')['val'].rolling(min_periods=1, window=3).sum().values
output:
id val sum
0 5 1 1.0
1 5 0 1.0
2 5 4 5.0
3 5 6 10.0
4 5 2 12.0
5 5 3 11.0
6 9 0 0.0
7 9 1 1.0
8 9 6 7.0
9 9 2 9.0
10 9 4 12.0
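Note that assigning with `.values` relies on the rows already being ordered by `id`, because the grouped result comes back group by group. As a hedged alternative sketch, the snippet below rebuilds the data from the `id` and `val` columns of the output above and aligns on the original index instead:

```python
import pandas as pd

# Data reconstructed from the id and val columns of the output above
df = pd.DataFrame({
    "id":  [5, 5, 5, 5, 5, 5, 9, 9, 9, 9, 9],
    "val": [1, 0, 4, 6, 2, 3, 0, 1, 6, 2, 4],
})

# Rolling sum over the current row and up to two previous rows, per id
rolled = df.groupby("id")["val"].rolling(window=3, min_periods=1).sum()

# Drop the group level so the result aligns on the original row index
df["sum"] = rolled.reset_index(level=0, drop=True)
print(df)
```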