Calculate Difference Between Values in Consecutive Rows by Group

Calculate difference between values in consecutive rows by group

The package data.table can do this fairly quickly, using the shift function.

require(data.table)
df <- data.table(group = rep(c(1, 2), each = 3), value = c(10,20,25,5,10,15))
#setDT(df) #if df is already a data frame

df[ , diff := value - shift(value), by = group]    
#   group value diff
#1:     1    10   NA
#2:     1    20   10
#3:     1    25    5
#4:     2     5   NA
#5:     2    10    5
#6:     2    15    5
setDF(df) #if you want to convert back to old data.frame syntax

Or using the lag function in dplyr

df %>%
    group_by(group) %>%
    mutate(Diff = value - lag(value))
#   group value  Diff
#   <int> <int> <int>
# 1     1    10    NA
# 2     1    20    10
# 3     1    25     5
# 4     2     5    NA
# 5     2    10     5
# 6     2    15     5

For alternatives pre-data.table::shift and pre-dplyr::lag, see edits.

Calculating the difference between consecutive rows by group using dplyr?

Like this:

dat %>% 
  group_by(id) %>% 
  mutate(time.difference = time - lag(time))

pandas groupby dataframes, calculate diffs between consecutive rows

You can use groupby

s2= df.groupby(['cycleID'])['mean'].diff()
s2.dropna(inplace=True)

output

1   -8.453876e-12
3   -1.486037e-11
5   2.482933e-12
7   -3.388330e-12
8   3.000000e-12

UPDATE

d = [[1, 1.5020712104685252e-11],
[1, 6.56683605063102e-12],
[2, 1.3993315187144084e-11],
[2, -8.670502467042485e-13],
[3, 7.0270625256163566e-12],
[3, 9.509995221868016e-12],
[4, 1.2901435995915644e-11],
[4, 9.513106448422182e-12]]

df = pd.DataFrame(d, columns=['cycleID', 'mean'])



df2 = df.groupby(['cycleID']).diff().dropna().rename(columns={'mean': 'difference'})
df2['mean'] = df['mean'].iloc[df2.index]

       difference    mean
1   -8.453876e-12   6.566836e-12
3   -1.486037e-11   -8.670502e-13
5   2.482933e-12    9.509995e-12
7   -3.388330e-12   9.513106e-12

difference between rows within groups pandas

See if this helps:

#replaces any value that contains a string value, with a 0
df['col2'] = pd.to_numeric(df.col2, errors='coerce').fillna(0)
#sorts the column in ascending first and calculates the difference 
df['diff']=df.sort_values(['col1','col2'],ascending=[1,1]).groupby('col1').diff()
#display the dataframe after sorting col1 in asc and col2 in desc
df.sort_values(['col1','col2'],ascending=[1,0])

Out:

Sample Image

computing datetime difference among consecutive rows in groupby DataFrame

Use GroupBy.diff and divide by a 1 month timedelta.

df['months'] = df.groupby('name')['date'].diff().div(pd.Timedelta(days=30.44), fill_value=0).round().astype(int)

output

    name       date  months
0   Mark 2018-01-01       0
1   Anne 2018-01-01       0
2   Anne 2018-02-01       1
3   Anne 2018-04-01       2
4   Anne 2018-09-01       5
5   Anne 2019-01-01       4
6   John 2018-02-01       0
7   John 2018-06-01       4
8   John 2019-02-01       8
9  Ethan 2018-03-01       0

Is there a function that will allow to get the difference between rows of the same type?

You can use diff in ave.

df$diff <- ave(df$y, df$x, FUN=function(z) c(diff(z), NA))

df
#            x    y diff
#1  Jimmy Page    1    1
#2  Jimmy Page    2    1
#3  Jimmy Page    3    1
#4  Jimmy Page    4   NA
#5  John Smith    5    2
#6  John Smith    7   82
#7  John Smith   89   NA
#8    Joe Root   12   22
#9    Joe Root   34   33
#10   Joe Root   67   28
#11   Joe Root   95 9579
#12   Joe Root 9674   NA

Difference between consecutive row groups using data table in R

We can get the diff of 'weight' grouped by by 'Game', multiply by -1 and concatenate both the values.

dt_in[, want := {v1 <- diff(weight); list(c(-v1, v1))} , by = Game]

dt_in
#    Game ID weight want
#1:    1  1    150   30
#2:    1  2    120  -30
#3:    2  3    151   -9
#4:    2  4    160    9   
#5:    3  5    190   20
#6:    3  6    170  -20
#7:    4  7    170    0
#8:    4  8    170    0

Or a compact option by @Frank

dt_in[, want := c(-1,1)*diff(weight), by=Game]