Calculate difference between values in consecutive rows by group
The package data.table
can do this fairly quickly, using the shift
function.
require(data.table)
df <- data.table(group = rep(c(1, 2), each = 3), value = c(10,20,25,5,10,15))
#setDT(df) #if df is already a data frame
df[ , diff := value - shift(value), by = group]
# group value diff
#1: 1 10 NA
#2: 1 20 10
#3: 1 25 5
#4: 2 5 NA
#5: 2 10 5
#6: 2 15 5
setDF(df) #if you want to convert back to old data.frame syntax
Or using the lag
function in dplyr
df %>%
group_by(group) %>%
mutate(Diff = value - lag(value))
# group value Diff
# <int> <int> <int>
# 1 1 10 NA
# 2 1 20 10
# 3 1 25 5
# 4 2 5 NA
# 5 2 10 5
# 6 2 15 5
For alternatives pre-data.table::shift
and pre-dplyr::lag
, see edits.
Calculating the difference between consecutive rows by group using dplyr?
Like this:
dat %>%
group_by(id) %>%
mutate(time.difference = time - lag(time))
pandas groupby dataframes, calculate diffs between consecutive rows
You can use groupby
s2= df.groupby(['cycleID'])['mean'].diff()
s2.dropna(inplace=True)
output
1 -8.453876e-12
3 -1.486037e-11
5 2.482933e-12
7 -3.388330e-12
8 3.000000e-12
UPDATE
d = [[1, 1.5020712104685252e-11],
[1, 6.56683605063102e-12],
[2, 1.3993315187144084e-11],
[2, -8.670502467042485e-13],
[3, 7.0270625256163566e-12],
[3, 9.509995221868016e-12],
[4, 1.2901435995915644e-11],
[4, 9.513106448422182e-12]]
df = pd.DataFrame(d, columns=['cycleID', 'mean'])
df2 = df.groupby(['cycleID']).diff().dropna().rename(columns={'mean': 'difference'})
df2['mean'] = df['mean'].iloc[df2.index]
difference mean
1 -8.453876e-12 6.566836e-12
3 -1.486037e-11 -8.670502e-13
5 2.482933e-12 9.509995e-12
7 -3.388330e-12 9.513106e-12
difference between rows within groups pandas
See if this helps:
#replaces any value that contains a string value, with a 0
df['col2'] = pd.to_numeric(df.col2, errors='coerce').fillna(0)
#sorts the column in ascending first and calculates the difference
df['diff']=df.sort_values(['col1','col2'],ascending=[1,1]).groupby('col1').diff()
#display the dataframe after sorting col1 in asc and col2 in desc
df.sort_values(['col1','col2'],ascending=[1,0])
Out:
computing datetime difference among consecutive rows in groupby DataFrame
Use GroupBy.diff
and divide
by a 1 month timedelta
.
df['months'] = df.groupby('name')['date'].diff().div(pd.Timedelta(days=30.44), fill_value=0).round().astype(int)
output
name date months
0 Mark 2018-01-01 0
1 Anne 2018-01-01 0
2 Anne 2018-02-01 1
3 Anne 2018-04-01 2
4 Anne 2018-09-01 5
5 Anne 2019-01-01 4
6 John 2018-02-01 0
7 John 2018-06-01 4
8 John 2019-02-01 8
9 Ethan 2018-03-01 0
Is there a function that will allow to get the difference between rows of the same type?
You can use diff
in ave
.
df$diff <- ave(df$y, df$x, FUN=function(z) c(diff(z), NA))
df
# x y diff
#1 Jimmy Page 1 1
#2 Jimmy Page 2 1
#3 Jimmy Page 3 1
#4 Jimmy Page 4 NA
#5 John Smith 5 2
#6 John Smith 7 82
#7 John Smith 89 NA
#8 Joe Root 12 22
#9 Joe Root 34 33
#10 Joe Root 67 28
#11 Joe Root 95 9579
#12 Joe Root 9674 NA
Difference between consecutive row groups using data table in R
We can get the diff
of 'weight' grouped by by 'Game', multiply by -1 and concatenate both the values.
dt_in[, want := {v1 <- diff(weight); list(c(-v1, v1))} , by = Game]
dt_in
# Game ID weight want
#1: 1 1 150 30
#2: 1 2 120 -30
#3: 2 3 151 -9
#4: 2 4 160 9
#5: 3 5 190 20
#6: 3 6 170 -20
#7: 4 7 170 0
#8: 4 8 170 0
Or a compact option by @Frank
dt_in[, want := c(-1,1)*diff(weight), by=Game]
Related Topics
Creating a New Column Based on Unique Id With Values in R
Filter a Data Frame According to Minimum and Maximum Values
Easier Way to Use Grepl and Ifelse Across Multiple Columns
Pass a String as Variable Name in Dplyr::Filter
Find Complement of a Data Frame (Anti - Join)
How to Install an R Package from Source
Add Column Which Contains Binned Values of a Numeric Column
Cbind a Dataframe With an Empty Dataframe - Cbind.Fill
Add a Common Legend For Combined Ggplots
How to Replace Negative Values in a Dataframe Column With a Different Value
Splitting a Large Data Frame into Smaller Segments
Installing Rgl on Ubuntu and Mac: X11 Not Found
Combine a List of Data Frames into One Data Frame by Row
Test If a Vector Contains a Given Element