Calculating Difference Between Two Rows in Python/Pandas

I think you want to do something like this:

In [26]: data
Out[26]: 
           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69

In [27]: data.set_index('Date').diff()
Out[27]: 
            Close  Adj Close
Date                        
2011-01-03    NaN        NaN
2011-01-04   0.16       0.16
2011-01-05  -0.59      -0.58
2011-01-06   1.61       1.57
2011-01-07  -0.73      -0.71
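
To reproduce this, here is a minimal sketch that rebuilds only the five rows printed above and takes the row-to-row difference. The variable name `changes` and the final `round(2)` (used only to hide floating-point noise) are my own additions. `set_index('Date')` moves the date strings out of the columns so that `diff()` only sees numeric data.

```python
import pandas as pd

# Rebuild the five rows from the printed frame above.
data = pd.DataFrame(
    {
        "Date": ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"],
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# diff() subtracts the previous row from each row.
# The first row has no predecessor, so it comes out as NaN.
changes = data.set_index("Date").diff().round(2)
print(changes)
```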

Pandas: how to sequentially alternate between calculating difference between two rows and skip calculation for the next row?

You can just mask the result, zeroing out every row at an even index position:

df['B'] = df['A'].diff().mask(df.index%2!=1,0)
df
Out[469]: 
     A     B
0  100   0.0
1  101   1.0
2  103   0.0
3  107   4.0
4  110   0.0
5  120  10.0
6  150   0.0
7  170  20.0

Or group the rows into consecutive pairs and diff within each pair:

df['B'] = df.groupby(df.index//2).A.diff().fillna(0)
Out[472]: 
0     0.0
1     1.0
2     0.0
3     4.0
4     0.0
5    10.0
6     0.0
7    20.0
Name: A, dtype: float64
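
As a quick check that the two approaches agree, here is a small sketch using the values of column A shown above. The names `masked` and `grouped` are mine, and both variants assume the frame has the default 0..n-1 integer index.

```python
import pandas as pd

df = pd.DataFrame({"A": [100, 101, 103, 107, 110, 120, 150, 170]})

# Approach 1: take every consecutive difference, then overwrite the
# rows at even index positions with 0.
masked = df["A"].diff().mask(df.index % 2 != 1, 0)

# Approach 2: pair the rows up as (0, 1), (2, 3), ... and diff inside each pair.
# The first row of each pair has no predecessor in its group, hence fillna(0).
grouped = df.groupby(df.index // 2)["A"].diff().fillna(0)

print(masked.tolist())
print(grouped.tolist())
```

If the index is not a plain range, something like `df.reset_index(drop=True)` beforehand should restore the assumption.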

Pandas: Calculate Difference between a row and all other rows and create column with the name

You want the pairwise absolute difference of the sum of the values for each row. The easiest might be to use the underlying numpy array.

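Absolute difference of the sum of the "value" columns:

# get sum of values per row and convert to numpy array
# (filter must be called on the DataFrame, not on a single column,
#  so that it selects every column whose name matches "value")
a = df.filter(regex='(?i)value').sum(axis=1).to_numpy()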

# compute the pairwise difference, create a DataFrame and join
df2 = df.join(pd.DataFrame(abs(a-a[:,None]), columns=df['Name'], index=df.index))

output:

      Name  value1  Value2 finallist  cosmos  network  unab
0   cosmos      10      20  [10, 20]       0       40    30
1  network      30      40  [30, 40]      40        0    10
2     unab      20      40  [20, 40]      30       10     0
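
The broadcasting step is the part that tends to confuse people, so here is a self-contained sketch of it. It rebuilds the frame from the output above without the `finallist` column, which is not needed for the calculation; the name `pairwise` is my own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["cosmos", "network", "unab"],
        "value1": [10, 30, 20],
        "Value2": [20, 40, 40],
    }
)

# Row sums over every column whose name contains "value" (case-insensitive).
a = df.filter(regex="(?i)value").sum(axis=1).to_numpy()

# a has shape (3,) and a[:, None] has shape (3, 1), so the subtraction
# broadcasts to a (3, 3) matrix whose entry [i, j] is a[j] - a[i].
pairwise = np.abs(a - a[:, None])

# One new column per name, aligned on the original row index.
df2 = df.join(pd.DataFrame(pairwise, columns=df["Name"].to_numpy(), index=df.index))
print(df2)
```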

Calculate difference between cells in different rows in a pandas Dataframe

Groupby 'ID' and calculate difference and then assign back to df:

df[['X diff','Y Diff']]=df.groupby('ID')[['X','Y']].diff()

output of df:

  Timestamp     ID      X        Y      X diff  Y Diff
0   0           100     1.728   14.378  NaN     NaN
1   12          100     2.035   14.378  0.307   0.000
2   24          100     2.342   14.378  0.307   0.000
3   36          100     2.630   14.378  0.288   0.000
4   48          100     2.937   14.416  0.307   0.038
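
The output above only shows one ID, so the effect of the groupby is not visible. Here is a small sketch with made-up numbers (the second ID and all X values are invented for illustration) showing that the difference restarts at each group boundary, whereas a plain `diff()` would subtract across IDs.

```python
import pandas as pd

# Hypothetical data: two IDs, three and two rows respectively.
df = pd.DataFrame(
    {
        "Timestamp": [0, 12, 24, 0, 12],
        "ID": [100, 100, 100, 200, 200],
        "X": [1.0, 1.5, 2.5, 10.0, 14.0],
    }
)

# Difference within each ID: the first row of every group is NaN.
df["X diff"] = df.groupby("ID")["X"].diff()

# Plain difference for comparison: row 3 is compared with the last row of ID 100.
plain = df["X"].diff()

print(df)
print(plain)
```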

calculate the difference between pandas rows in pairs

Since Value alternates in sign from row to row, why don't you just use a rolling sum over two rows:

df['price_diff'] = df['Value'].rolling(2).sum()

Although, judging by the column name, it looks like you actually want the difference in Price:

df['price_diff'] = df['Price'].diff()

And, for the two columns:

df[['Date_diff','Price_diff']] = df[['Date','Price']].diff()

Output:

        Date  Qty       Price    Value Date_diff  Price_diff
0 2014-11-18   58  495.775716 -2875499       NaT         NaN
1 2014-11-24  -58  484.280147  2808824    6 days  -11.495569
2 2014-11-26   63  474.138699 -2987073    2 days  -10.141448
3 2014-12-31  -63  507.931247  3199966   35 days   33.792548
4 2015-01-05   59  495.923771 -2925950    5 days  -12.007476
5 2015-02-05  -59  456.224370  2691723   31 days  -39.699401

Update: per the comment, you can try:

df['Val_sum'] = df['Value'].rolling(2).sum()[1::2]

Output:

        Date  Qty       Price    Value   Val_sum
0 2014-11-18   58  495.775716 -2875499       NaN
1 2014-11-24  -58  484.280147  2808824  -66675.0
2 2014-11-26   63  474.138699 -2987073       NaN
3 2014-12-31  -63  507.931247  3199966  212893.0
4 2015-01-05   59  495.923771 -2925950       NaN
5 2015-02-05  -59  456.224370  2691723 -234227.0
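
Here is a sketch of why the `[1::2]` slice works, using just the Value column from the output above. The intermediate name `rolled` is mine. The slice keeps only rows 1, 3, 5, and assigning it back aligns on the index, so the skipped rows are filled with NaN.

```python
import pandas as pd

df = pd.DataFrame(
    {"Value": [-2875499, 2808824, -2987073, 3199966, -2925950, 2691723]}
)

# Sum of each row with the row before it (NaN for the very first row).
rolled = df["Value"].rolling(2).sum()

# Keep every second result, starting at position 1, so that each pair
# (0, 1), (2, 3), (4, 5) contributes exactly one value.
df["Val_sum"] = rolled[1::2]

print(df)
```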

How to calculate difference between rows in Pandas DataFrame?

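Consider using shift to create adjacent columns w2, x2, y2, z2 that hold the next row's values, then run a row-wise apply, which requires axis='columns' (not 'index'). The source and target column lists must be in the same order, because the assignment matches them by position:

# Assumption: Quaternion is the class from the pyquaternion package
from pyquaternion import Quaternion

cols = list('wxyz')
df[[col+'2' for col in cols]] = df[cols].shift(-1)

def quaternion_distances(row):
    """ Create two Quaternion objects and calculate 3 distances between them """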
    q1 = Quaternion(row['w'], row['x'], row['y'], row['z'])
    q2 = Quaternion(row['w2'], row['x2'], row['y2'], row['z2'])

    row['dist_by_signal']  = Quaternion.absolute_distance(q1, q2)
    row['dist_geodesic']   = Quaternion.distance(q1, q2)
    row['dist_sim_geodec'] = Quaternion.sym_distance(q1, q2)

    return row

df = df.apply(quaternion_distances, axis='columns')

print(df)
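
For readers without a quaternion library at hand, the same shift-then-apply pattern can be sketched with ordinary 2D points and a Euclidean distance. Everything here (the data, the `x2`/`y2` columns, and the `step_distance` helper) is hypothetical and only illustrates the pattern, not the quaternion maths.

```python
import math

import pandas as pd

# Hypothetical 2D positions, one per row.
df = pd.DataFrame({"x": [0.0, 3.0, 3.0], "y": [0.0, 4.0, 4.0]})

# Put the next row's coordinates next to the current row's.
# The last row has no successor, so x2 and y2 are NaN there.
df[["x2", "y2"]] = df[["x", "y"]].shift(-1)


def step_distance(row):
    """Euclidean distance from this row's point to the next row's point."""
    return math.hypot(row["x2"] - row["x"], row["y2"] - row["y"])


# axis="columns" passes one row at a time to the function.
df["dist"] = df.apply(step_distance, axis="columns")

print(df)
```

The last row comes out as NaN because it has nothing to be compared with; depending on the use case you may want to drop it before applying the function.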

Pandas calculating difference between rows

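By using apply with .iloc (group_keys=False keeps the original row index; on pandas 2.x, apply would otherwise prepend the group key to the index):

df.groupby('Id', group_keys=False).Day.apply(lambda x : x.iloc[-1]-x).replace(0,np.nan)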
Out[187]: 
0    4.0
1    NaN
2    5.0
3    NaN
4    6.0
5    4.0
6    NaN
Name: Day, dtype: float64

Then assign it back:

df['Length']=df.groupby('Id', group_keys=False).Day.apply(lambda x : x.iloc[-1]-x).replace(0,np.nan)
df
Out[189]: 
    Id  Day Status  Length
0  111    1  Start     4.0
1  111    5    End     NaN
2  222    2  Begin     5.0
3  222    7    End     NaN
4  333    1  Start     6.0
5  333    3  Begin     4.0
6  333    7    End     NaN
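
As an alternative sketch that avoids apply altogether, `transform('last')` broadcasts each group's last Day back onto every row of that group, so the subtraction is a plain column operation. The frame is rebuilt from the output above; the name `last_day` is mine.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Id": [111, 111, 222, 222, 333, 333, 333],
        "Day": [1, 5, 2, 7, 1, 3, 7],
        "Status": ["Start", "End", "Begin", "End", "Start", "Begin", "End"],
    }
)

# Last Day of each Id, repeated for every row of that Id (same index as df).
last_day = df.groupby("Id")["Day"].transform("last")

# Days remaining until the group's last row; the last row itself (0) becomes NaN.
df["Length"] = (last_day - df["Day"]).replace(0, np.nan)

print(df)
```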

How to calculate differences between consecutive rows in pandas data frame?

diff should give the desired result:

>>> df.diff()
            count_a  count_b
2015-01-01      NaN      NaN
2015-01-02    38465      NaN
2015-01-03    36714      NaN
2015-01-04    35137      NaN
2015-01-05    35864      NaN
....
2015-02-07   142390    25552
2015-02-08   126768    22835
2015-02-09   122324    21485
