Calculating Difference Between Two Rows in Python/Pandas

I think you want to set the date column as the index and call diff(), which subtracts the previous row from each row:

In [26]: data
Out[26]:
           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69

In [27]: data.set_index('Date').diff()
Out[27]:
            Close  Adj Close
Date
2011-01-03    NaN        NaN
2011-01-04   0.16       0.16
2011-01-05  -0.59      -0.58
2011-01-06   1.61       1.57
2011-01-07  -0.73      -0.71
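
For reference, a minimal self-contained sketch of the same call. The frame is rebuilt from the values printed above; the rounding and the periods=-1 variant are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the output shown above
data = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"]
        ),
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# Each row minus the previous row; the first row has no predecessor, so it is NaN
diffs = data.set_index("Date").diff().round(2)
print(diffs)

# periods=-1 subtracts the following row instead, so the last row becomes NaN
forward = data.set_index("Date").diff(periods=-1).round(2)
print(forward)
```

diff() works on row position, so the rows should already be sorted in the order you want to compare them.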

Pandas: how to sequentially alternate between calculating difference between two rows and skip calculation for the next row?

You can compute the full diff and then mask every other row to 0:

df['B'] = df['A'].diff().mask(df.index % 2 != 1, 0)
df
Out[469]:
     A     B
0  100   0.0
1  101   1.0
2  103   0.0
3  107   4.0
4  110   0.0
5  120  10.0
6  150   0.0
7  170  20.0

Or group the rows in pairs with groupby and take the diff within each pair (assign the result to df['B'] to store it):

df.groupby(df.index // 2)['A'].diff().fillna(0)
Out[472]:
0     0.0
1     1.0
2     0.0
3     4.0
4     0.0
5    10.0
6     0.0
7    20.0
Name: A, dtype: float64
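
Both versions above rely on the index being the default 0, 1, 2, ... range. A sketch of the same idea using row positions instead, assuming the index may hold arbitrary labels (the letter index and the names pos, first_of_pair and B_grouped are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [100, 101, 103, 107, 110, 120, 150, 170]},
    index=list("abcdefgh"),  # non-numeric labels on purpose
)

pos = np.arange(len(df))      # 0, 1, 2, ... regardless of the index labels
first_of_pair = pos % 2 == 0  # rows where no difference should be reported

# Variant 1: full diff, then overwrite the first row of every pair with 0
df["B"] = df["A"].diff().mask(first_of_pair, 0)

# Variant 2: diff inside groups of two consecutive rows, NaN replaced by 0
df["B_grouped"] = df.groupby(pos // 2)["A"].diff().fillna(0)

print(df)
```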

Pandas: Calculate Difference between a row and all other rows and create column with the name

You want the pairwise absolute difference of the sum of the values for each row. The easiest might be to use the underlying numpy array.

Absolute difference of the sum of the "value" columns:

# select every column whose name contains "value" (case-insensitive),
# sum them per row, and convert to a numpy array
a = df.filter(regex='(?i)value').sum(axis=1).to_numpy()

# compute the pairwise difference, create a DataFrame and join
df2 = df.join(pd.DataFrame(abs(a-a[:,None]), columns=df['Name'], index=df.index))

output:

      Name  value1  Value2 finallist  cosmos  network  unab
0   cosmos      10      20  [10, 20]       0       40    30
1  network      30      40  [30, 40]      40        0    10
2     unab      20      40  [20, 40]      30       10     0
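
A self-contained sketch of the broadcasting step, using the values from the output above (the finallist column is left out, and the names row_sums and pairwise are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["cosmos", "network", "unab"],
        "value1": [10, 30, 20],
        "Value2": [20, 40, 40],
    }
)

# Row sums over every column whose name contains "value", ignoring case
row_sums = df.filter(regex="(?i)value").sum(axis=1).to_numpy()

# row_sums has shape (n,), row_sums[:, None] has shape (n, 1);
# broadcasting their difference yields the full (n, n) matrix
pairwise = np.abs(row_sums - row_sums[:, None])

df2 = df.join(
    pd.DataFrame(pairwise, columns=df["Name"].tolist(), index=df.index)
)
print(df2)
```

The result holds n x n values, so this approach suits frames with a modest number of rows.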

Calculate difference between cells in different rows in a pandas Dataframe

Group by 'ID', take the difference within each group, and assign the result back to df:

df[['X diff', 'Y Diff']] = df.groupby('ID')[['X', 'Y']].diff()

output of df:

   Timestamp   ID      X       Y  X diff  Y Diff
0          0  100  1.728  14.378     NaN     NaN
1         12  100  2.035  14.378   0.307   0.000
2         24  100  2.342  14.378   0.307   0.000
3         36  100  2.630  14.378   0.288   0.000
4         48  100  2.937  14.416   0.307   0.038
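
The output above only shows one ID, so the effect of the grouping is not visible. A small sketch with a second, made-up ID (200, with invented X and Y values) shows the difference restarting at each group boundary:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Timestamp": [0, 12, 24, 0, 12],
        "ID": [100, 100, 100, 200, 200],   # ID 200 is hypothetical
        "X": [1.728, 2.035, 2.342, 5.0, 5.5],
        "Y": [14.378, 14.378, 14.378, 3.0, 2.0],
    }
)

# diff() runs separately inside each ID, so the first row of every ID is NaN
df[["X diff", "Y Diff"]] = df.groupby("ID")[["X", "Y"]].diff()
print(df)
```

Without the groupby, row 3 would be compared against the last row of ID 100.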

calculate the difference between pandas rows in pairs

Why not just use a rolling sum over each pair of rows:

df['price_diff'] = df['Value'].rolling(2).sum()

Although, judging by the column name, you may actually want the row-to-row difference in Price:

df['price_diff'] = df['Price'].diff()

And, for the two columns:

df[['Date_diff','Price_diff']] = df[['Date','Price']].diff()

Output:

        Date  Qty       Price    Value Date_diff  Price_diff
0 2014-11-18   58  495.775716 -2875499       NaT         NaN
1 2014-11-24  -58  484.280147  2808824    6 days  -11.495569
2 2014-11-26   63  474.138699 -2987073    2 days  -10.141448
3 2014-12-31  -63  507.931247  3199966   35 days   33.792548
4 2015-01-05   59  495.923771 -2925950    5 days  -12.007476
5 2015-02-05  -59  456.224370  2691723   31 days  -39.699401

Update: per the comment, you can keep only every second rolling sum, so each pair is reported once:

df['Val_sum'] = df['Value'].rolling(2).sum()[1::2]

Output:

        Date  Qty       Price    Value   Val_sum
0 2014-11-18   58  495.775716 -2875499       NaN
1 2014-11-24  -58  484.280147  2808824  -66675.0
2 2014-11-26   63  474.138699 -2987073       NaN
3 2014-12-31  -63  507.931247  3199966  212893.0
4 2015-01-05   59  495.923771 -2925950       NaN
5 2015-02-05  -59  456.224370  2691723 -234227.0
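
A runnable sketch of the paired sum, using only the Value column from the output above (the name pair_sum is illustrative). The slice keeps rows 1, 3, 5, and index alignment on assignment fills the remaining rows with NaN:

```python
import pandas as pd

df = pd.DataFrame(
    {"Value": [-2875499, 2808824, -2987073, 3199966, -2925950, 2691723]}
)

# Sum of each row and the row before it
pair_sum = df["Value"].rolling(2).sum()

# Keep only the second row of every pair; assignment aligns on the index,
# so rows 0, 2 and 4 receive NaN
df["Val_sum"] = pair_sum.iloc[1::2]
print(df)
```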

How to calculate difference between rows in Pandas DataFrame?

Consider using shift to create adjacent columns w2, x2, y2, z2 that hold the next row's values, then run a row-wise apply, which requires axis='columns' (not 'index'):

# assumes Quaternion comes from the pyquaternion package
from pyquaternion import Quaternion

cols = list('wxyz')
# keep source and target columns in the same order so that w2 holds the next w, and so on
df[[col + '2' for col in cols]] = df[cols].shift(-1)

def quaternion_distances(row):
    """Create two Quaternion objects and calculate three distances between them."""
    q1 = Quaternion(row['w'], row['x'], row['y'], row['z'])
    q2 = Quaternion(row['w2'], row['x2'], row['y2'], row['z2'])

    row['dist_by_signal'] = Quaternion.absolute_distance(q1, q2)
    row['dist_geodesic'] = Quaternion.distance(q1, q2)
    row['dist_sim_geodec'] = Quaternion.sym_distance(q1, q2)

    return row

df = df.apply(quaternion_distances, axis='columns')

print(df)
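
When the distance can be written as array arithmetic, the shifted frame can be used directly without a row-wise apply. A sketch under that assumption, with plain Euclidean distance standing in for the Quaternion methods (the sample values and the column name dist_euclid are made up):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "w": [1.0, 0.0, 0.0],
        "x": [0.0, 1.0, 0.0],
        "y": [0.0, 0.0, 1.0],
        "z": [0.0, 0.0, 0.0],
    }
)

cols = ["w", "x", "y", "z"]

# Next row's values, aligned on the current row; the last row becomes all NaN
nxt = df[cols].shift(-1)

# Euclidean distance between each row and the next one.
# skipna=False keeps NaN for the last row instead of summing to 0.
df["dist_euclid"] = np.sqrt(((nxt - df[cols]) ** 2).sum(axis=1, skipna=False))
print(df)
```

The last row has no successor, so its distance stays NaN, which is also worth handling in the apply version.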

Pandas calculating difference between rows

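Use groupby with apply and .iloc to subtract each Day from the last Day in its group (group_keys=False keeps the original index on the result):

df.groupby('Id', group_keys=False).Day.apply(lambda x: x.iloc[-1] - x).replace(0, np.nan)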
Out[187]:
0    4.0
1    NaN
2    5.0
3    NaN
4    6.0
5    4.0
6    NaN
Name: Day, dtype: float64

Then assign it back:

df['Length'] = df.groupby('Id', group_keys=False).Day.apply(lambda x: x.iloc[-1] - x).replace(0, np.nan)
df
Out[189]:
    Id  Day Status  Length
0  111    1  Start     4.0
1  111    5    End     NaN
2  222    2  Begin     5.0
3  222    7    End     NaN
4  333    1  Start     6.0
5  333    3  Begin     4.0
6  333    7    End     NaN
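
An alternative sketch that avoids apply: transform('last') broadcasts each group's last Day back onto the original index, so the result always aligns with df. This is a swapped-in technique rather than the answer's own method, and it assumes rows are already ordered within each Id. Depending on the pandas version, apply can add the group key to the result index, which is one reason to prefer transform here.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Id": [111, 111, 222, 222, 333, 333, 333],
        "Day": [1, 5, 2, 7, 1, 3, 7],
        "Status": ["Start", "End", "Begin", "End", "Start", "Begin", "End"],
    }
)

# Last Day of each Id, repeated for every row of that Id
last_day = df.groupby("Id")["Day"].transform("last")

length = last_day - df["Day"]

# Keep non-zero lengths; the final row of each Id (length 0) becomes NaN
df["Length"] = length.where(length != 0)
print(df)
```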

How to calculate differences between consecutive rows in pandas data frame?

diff should give the desired result:

>>> df.diff()
            count_a  count_b
2015-01-01      NaN      NaN
2015-01-02    38465      NaN
2015-01-03    36714      NaN
2015-01-04    35137      NaN
2015-01-05    35864      NaN
....
2015-02-07   142390    25552
2015-02-08   126768    22835
2015-02-09   122324    21485
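
A small sketch of how diff() treats missing values, using made-up numbers: any difference that involves a NaN is itself NaN, so a column only produces values once two consecutive rows are populated.

```python
import numpy as np
import pandas as pd

# Hypothetical counts; count_b has no data for the first two days
df = pd.DataFrame(
    {
        "count_a": [100, 150, 130, 170],
        "count_b": [np.nan, np.nan, 10, 25],
    },
    index=pd.date_range("2015-01-01", periods=4),
)

d = df.diff()
print(d)
```

Here count_b stays NaN through 2015-01-03, because that row is compared against a missing value on 2015-01-02.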

