Calculating difference between two rows in Python / Pandas
I think you want to do something like this:
In [26]: data
Out[26]:
Date Close Adj Close
251 2011-01-03 147.48 143.25
250 2011-01-04 147.64 143.41
249 2011-01-05 147.05 142.83
248 2011-01-06 148.66 144.40
247 2011-01-07 147.93 143.69
In [27]: data.set_index('Date').diff()
Out[27]:
Close Adj Close
Date
2011-01-03 NaN NaN
2011-01-04 0.16 0.16
2011-01-05 -0.59 -0.58
2011-01-06 1.61 1.57
2011-01-07 -0.73 -0.71
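For reference, the transcript above can be reproduced end to end with a minimal sketch like the following. The frame is rebuilt by hand from the values shown, and the final rounding to two decimals is an addition here, only to hide floating-point noise in the printed result.

```python
import pandas as pd

# Rebuild the sample frame from the values shown in the transcript.
data = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"]
        ),
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# Move Date into the index so that diff() only touches the numeric columns,
# then subtract each row's predecessor. The first row has none, so it is NaN.
diffs = data.set_index("Date").diff().round(2)
print(diffs)
```

Leaving Date as a regular column would also work, but diff() would then produce a timedelta column for it as well.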
Pandas: how to sequentially alternate between calculating difference between two rows and skip calculation for the next row?
You can just mask the result:
df['B'] = df['A'].diff().mask(df.index%2!=1,0)
df
Out[469]:
A B
0 100 0.0
1 101 1.0
2 103 0.0
3 107 4.0
4 110 0.0
5 120 10.0
6 150 0.0
7 170 20.0
Or group the rows into consecutive pairs with groupby:
df['B'] = df.groupby(df.index//2).A.diff().fillna(0)
Out[472]:
0 0.0
1 1.0
2 0.0
3 4.0
4 0.0
5 10.0
6 0.0
7 20.0
Name: A, dtype: float64
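Both approaches can be checked side by side with a short sketch. It assumes the default 0..n-1 integer index, since both expressions do arithmetic on df.index; the variable names masked and grouped are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [100, 101, 103, 107, 110, 120, 150, 170]})

# Approach 1: compute every consecutive difference, then overwrite the rows
# whose index is not odd (0, 2, 4, ...) with 0.
masked = df["A"].diff().mask(df.index % 2 != 1, 0)

# Approach 2: label rows 0-1, 2-3, 4-5, ... as groups and diff inside each group.
# The first row of every pair has no predecessor within its group, so it is NaN
# and gets filled with 0.
grouped = df.groupby(df.index // 2)["A"].diff().fillna(0)

print(masked.tolist())
print(grouped.tolist())
```

With a non-default index, the same idea should work using positions instead, for example np.arange(len(df)) in place of df.index.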
Pandas: Calculate Difference between a row and all other rows and create column with the name
You want the pairwise absolute difference of the sum of the values for each row. The easiest might be to use the underlying numpy array.
Absolute difference of the sum of the "value" columns:
# select the "value" columns (case-insensitive), sum them per row and convert to a numpy array
a = df.filter(regex='(?i)value').sum(axis=1).to_numpy()
# compute the pairwise difference, create a DataFrame and join
df2 = df.join(pd.DataFrame(abs(a-a[:,None]), columns=df['Name'], index=df.index))
output:
Name value1 Value2 finallist cosmos network unab
0 cosmos 10 20 [10, 20] 0 40 30
1 network 30 40 [30, 40] 40 0 10
2 unab 20 40 [20, 40] 30 10 0
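The broadcasting step is the core of this answer, so here is a self-contained sketch of it. The frame is rebuilt from the output above without the finallist column, and the names row_sums and pairwise are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["cosmos", "network", "unab"],
        "value1": [10, 30, 20],
        "Value2": [20, 40, 40],
    }
)

# Sum every column whose name contains "value" (case-insensitive) for each row.
row_sums = df.filter(regex="(?i)value").sum(axis=1).to_numpy()

# row_sums has shape (3,) and row_sums[:, None] has shape (3, 1), so the
# subtraction broadcasts to a (3, 3) matrix of all pairwise differences.
pairwise = np.abs(row_sums - row_sums[:, None])

# One new column per name, aligned on the original index.
df2 = df.join(pd.DataFrame(pairwise, columns=df["Name"].tolist(), index=df.index))
print(df2)
```

The matrix is symmetric with zeros on the diagonal, since each row is compared with itself there.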
Calculate difference between cells in different rows in a pandas Dataframe
Group by 'ID', calculate the difference within each group, and assign the result back to df:
df[['X diff','Y Diff']]=df.groupby('ID')[['X','Y']].diff()
Output of df:
Timestamp ID X Y X diff Y Diff
0 0 100 1.728 14.378 NaN NaN
1 12 100 2.035 14.378 0.307 0.000
2 24 100 2.342 14.378 0.307 0.000
3 36 100 2.630 14.378 0.288 0.000
4 48 100 2.937 14.416 0.307 0.038
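The sample above contains a single ID, so the effect of grouping is not visible there. A small sketch with a second, made-up ID (the rows for ID 200 are hypothetical and not from the question) shows that the difference restarts at each group boundary:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Timestamp": [0, 12, 24, 0, 12],
        "ID": [100, 100, 100, 200, 200],  # ID 200 rows are invented for illustration
        "X": [1.728, 2.035, 2.342, 5.0, 5.5],
        "Y": [14.378, 14.378, 14.416, 7.0, 7.25],
    }
)

# diff() runs separately inside each ID, so the first row of every ID is NaN
# instead of being compared with the last row of the previous ID.
df[["X diff", "Y diff"]] = df.groupby("ID")[["X", "Y"]].diff()
print(df)
```

A plain df[['X', 'Y']].diff() would instead subtract the last row of ID 100 from the first row of ID 200, which is rarely what is wanted.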
calculate the difference between pandas rows in pairs
Why don't you just use a rolling sum:
df['price_diff'] = df['Value'].rolling(2).sum()
Although, judging by the column name, it looks like you want diff:
df['price_diff'] = df['Price'].diff()
And, for the two columns:
df[['Date_diff','Price_diff']] = df[['Date','Price']].diff()
Output:
Date Qty Price Value Date_diff Price_diff
0 2014-11-18 58 495.775716 -2875499 NaT NaN
1 2014-11-24 -58 484.280147 2808824 6 days -11.495569
2 2014-11-26 63 474.138699 -2987073 2 days -10.141448
3 2014-12-31 -63 507.931247 3199966 35 days 33.792548
4 2015-01-05 59 495.923771 -2925950 5 days -12.007476
5 2015-02-05 -59 456.224370 2691723 31 days -39.699401
Update: per the comment, you can try:
df['Val_sum'] = df['Value'].rolling(2).sum()[1::2]
Output:
Date Qty Price Value Val_sum
0 2014-11-18 58 495.775716 -2875499 NaN
1 2014-11-24 -58 484.280147 2808824 -66675.0
2 2014-11-26 63 474.138699 -2987073 NaN
3 2014-12-31 -63 507.931247 3199966 212893.0
4 2015-01-05 59 495.923771 -2925950 NaN
5 2015-02-05 -59 456.224370 2691723 -234227.0
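The pairing trick can be tried in isolation with a sketch like this one. It uses only the Value column from the output above, and it swaps in .iloc[1::2] to make the positional slicing explicit, which is a small deviation from the original one-liner.

```python
import pandas as pd

df = pd.DataFrame(
    {"Value": [-2875499, 2808824, -2987073, 3199966, -2925950, 2691723]}
)

# rolling(2).sum() adds each row to the one before it, for every row.
pair_sums = df["Value"].rolling(2).sum()

# Keep only every second result (positions 1, 3, 5), i.e. one sum per pair.
# Assignment aligns on the index, so the skipped rows become NaN.
df["Val_sum"] = pair_sums.iloc[1::2]
print(df)
```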
How to calculate difference between rows in Pandas DataFrame?
Consider shift to create adjacent columns w2, x2, y2, z2 holding the next row's values, then run a row-wise apply, which requires axis='columns' (not 'index'):
from pyquaternion import Quaternion  # assumption: Quaternion comes from the pyquaternion package

# Copy the next row's w, x, y, z into w2, x2, y2, z2 (same column order on both sides)
df[[col + '2' for col in 'wxyz']] = df[list('wxyz')].shift(-1)

def quaternion_distances(row):
    """Create two Quaternion objects and calculate 3 distances between them."""
    q1 = Quaternion(row['w'], row['x'], row['y'], row['z'])
    q2 = Quaternion(row['w2'], row['x2'], row['y2'], row['z2'])
    row['dist_by_signal'] = Quaternion.absolute_distance(q1, q2)
    row['dist_geodesic'] = Quaternion.distance(q1, q2)
    row['dist_sim_geodec'] = Quaternion.sym_distance(q1, q2)
    return row

# The last row has no successor, so its w2..z2 values are NaN
df = df.apply(quaternion_distances, axis='columns')
print(df)
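The shift-then-compare pattern does not depend on quaternions. As a rough illustration of the same idea without any extra package, the sketch below swaps in a plain Euclidean distance between each row and the next; the two-column frame and the dist_to_next name are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-D points, one per row.
df = pd.DataFrame({"x": [0.0, 3.0, 3.0], "y": [0.0, 4.0, 4.0]})

# shift(-1) moves every value up by one row, so each row now sees its successor.
nxt = df[["x", "y"]].shift(-1)

# Euclidean distance to the next row. skipna=False keeps the last row as NaN,
# because it has no successor to compare against.
squared = (nxt - df[["x", "y"]]) ** 2
df["dist_to_next"] = np.sqrt(squared.sum(axis=1, skipna=False))
print(df)
```

When the distance can be written with vectorized arithmetic like this, a row-wise apply is not needed; apply remains the fallback for functions that only accept scalar inputs.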
Pandas calculating difference between rows
By using apply with .iloc:
df.groupby('Id').Day.apply(lambda x : x.iloc[-1]-x).replace(0,np.nan)
Out[187]:
0 4.0
1 NaN
2 5.0
3 NaN
4 6.0
5 4.0
6 NaN
Name: Day, dtype: float64
Then assign it back:
df['Length']=df.groupby('Id').Day.apply(lambda x : x.iloc[-1]-x).replace(0,np.nan)
df
Out[189]:
Id Day Status Length
0 111 1 Start 4.0
1 111 5 End NaN
2 222 2 Begin 5.0
3 222 7 End NaN
4 333 1 Start 6.0
5 333 3 Begin 4.0
6 333 7 End NaN
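Depending on the pandas version and the group_keys setting, groupby(...).apply may prepend the group key to the result's index, which can interfere with assigning it back. A possible alternative, sketched here and not part of the original answer, is transform('last'), which always returns a result aligned with the original index:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Id": [111, 111, 222, 222, 333, 333, 333],
        "Day": [1, 5, 2, 7, 1, 3, 7],
        "Status": ["Start", "End", "Begin", "End", "Start", "Begin", "End"],
    }
)

# Broadcast each group's last Day value to every row of that group.
last_day = df.groupby("Id")["Day"].transform("last")

# Days remaining until the group's final row. The final row itself gives 0,
# which is replaced by NaN as in the output above.
df["Length"] = (last_day - df["Day"]).replace(0, np.nan)
print(df)
```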
How to calculate differences between consecutive rows in pandas data frame?
diff should give the desired result:
>>> df.diff()
count_a count_b
2015-01-01 NaN NaN
2015-01-02 38465 NaN
2015-01-03 36714 NaN
2015-01-04 35137 NaN
2015-01-05 35864 NaN
....
2015-02-07 142390 25552
2015-02-08 126768 22835
2015-02-09 122324 21485