How to Find the Difference in Value in Every Two Consecutive Rows in R

Calculate difference between values in consecutive rows by group

The package data.table can do this fairly quickly, using the shift function.

library(data.table)
df <- data.table(group = rep(c(1, 2), each = 3), value = c(10,20,25,5,10,15))
#setDT(df) #if df is already a data frame

df[ , diff := value - shift(value), by = group]
# group value diff
#1: 1 10 NA
#2: 1 20 10
#3: 1 25 5
#4: 2 5 NA
#5: 2 10 5
#6: 2 15 5
setDF(df) #if you want to convert back to old data.frame syntax

Or use the lag function from dplyr:

library(dplyr)
df %>%
  group_by(group) %>%
  mutate(Diff = value - lag(value))
# group value Diff
# <int> <int> <int>
# 1 1 10 NA
# 2 1 20 10
# 3 1 25 5
# 4 2 5 NA
# 5 2 10 5
# 6 2 15 5
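
If neither package is an option, the same per-group difference can be sketched in base R with ave(). This is only a minimal sketch that rebuilds the small example data from above; it is not taken from the original answer.

```r
# Base R sketch: per-group difference from the previous row, no packages needed.
df <- data.frame(group = rep(c(1, 2), each = 3),
                 value = c(10, 20, 25, 5, 10, 15))

# ave() applies FUN to the values of each group and puts the results back
# in the original row order. diff() returns one element fewer than its
# input, so the first row of every group is padded with NA.
df$diff <- ave(df$value, df$group, FUN = function(x) c(NA, diff(x)))

print(df)
```

The rows do not need to be contiguous by group for this to work, but they do need to be in the desired order within each group.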


R programming: How to find a difference in value for every two consecutive dates, given a specific ID

## Order data.frame by IDs, then by increasing sleep_end_dates (if not already sorted)
df <- df[order(df$ID, df$sleep_end_date),]

## Calculate difference in total_sleep with previous entry
df$diff_hours_of_sleep <- c(NA,abs(diff(df$total_sleep)))

## If previous ID is not equal, replace diff_hours_of_sleep with NA
ind <- c(NA, diff(df$ID))
df$diff_hours_of_sleep[ind != 0] <- NA

## And if previous day wasn't yesterday, replace diff_hours_of_sleep with NA
day_ind <- c(NA, diff(df$sleep_end_date))
df$diff_hours_of_sleep[day_ind != 1] <- NA
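
To see what these steps do, here is a small worked run. The column names come from the code above, but the values are made up for illustration, and ID is assumed to be numeric and sleep_end_date to be of class Date.

```r
# Hypothetical sample data (values invented for illustration only).
df <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  sleep_end_date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-04",
                             "2020-01-01", "2020-01-02")),
  total_sleep = c(7, 8.5, 6, 5, 9)
)

# Same steps as above: sort, take absolute differences, then blank out
# rows whose predecessor belongs to another ID or is not the previous day.
df <- df[order(df$ID, df$sleep_end_date), ]
df$diff_hours_of_sleep <- c(NA, abs(diff(df$total_sleep)))

ind <- c(NA, diff(df$ID))
df$diff_hours_of_sleep[ind != 0] <- NA

day_ind <- c(NA, diff(df$sleep_end_date))
df$diff_hours_of_sleep[day_ind != 1] <- NA

print(df$diff_hours_of_sleep)
# NA 1.5 NA NA 4.0
```

Row 3 becomes NA because two days passed since the previous entry, and row 4 becomes NA because it is the first entry for ID 2.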

Compute the difference of two values within one column

The solution is quite straightforward if, as your sample suggests, you always have exactly 2 values for each subject. Note that fill() comes from tidyr, not dplyr:

library(dplyr)
library(tidyr)
df %>%
  group_by(Subject) %>%
  mutate(Diff = lead(Response_time) - Response_time) %>%
  fill(Diff)
# A tibble: 6 × 3
# Groups: Subject [3]
#   Subject Response_time  Diff
#   <chr>           <dbl> <dbl>
# 1 Jeff             1000  2000
# 2 Jeff             3000  2000
# 3 Amy              2000 11000
# 4 Amy             13000 11000
# 5 Ed               1500   300
# 6 Ed               1800   300

Data:

df <- data.frame(
Subject = c("Jeff","Jeff","Amy","Amy","Ed","Ed"),
Response_time = c(1000,3000,2000,13000,1500,1800)
)
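
If a single number per subject is enough, rather than the difference repeated on both rows, a base R sketch with tapply() might look like this. It relies on the same assumption as above: exactly two rows per subject, already in the desired order.

```r
# Same data as above.
df <- data.frame(
  Subject = c("Jeff", "Jeff", "Amy", "Amy", "Ed", "Ed"),
  Response_time = c(1000, 3000, 2000, 13000, 1500, 1800)
)

# diff() on a two-element vector returns (second - first), so tapply()
# yields one value per subject. Subjects come back in alphabetical order.
res <- tapply(df$Response_time, df$Subject, diff)

print(res)
```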

Calculating the difference between consecutive rows by group using dplyr?

Like this:

dat %>%
  group_by(id) %>%
  mutate(time.difference = time - lag(time))
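
Keep in mind that lag() works on row order, so the rows should be sorted by time within each id first. Here is a minimal sketch on made-up data (the values in dat are hypothetical; only the column names come from the snippet above):

```r
library(dplyr)

# Hypothetical, deliberately unsorted data.
dat <- data.frame(
  id   = c("a", "a", "a", "b", "b"),
  time = c(5, 1, 3, 10, 4)
)

res <- dat %>%
  arrange(id, time) %>%                          # sort within each id first
  group_by(id) %>%
  mutate(time.difference = time - lag(time)) %>% # NA on each group's first row
  ungroup()

print(res)
```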

Get the difference between two non consecutive rows

This should do the trick, provided every column is numeric. With lag = 3, diff() subtracts each row from the row 3 positions below it, so the resulting matrix has 3 fewer rows than your input.

diff(as.matrix(your_data_frame), lag = 3)
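
As a quick check of the shape and values, here is a small sketch with an invented two-column data frame (the column names a and b and their values are placeholders):

```r
# Hypothetical numeric data frame with 6 rows.
your_data_frame <- data.frame(a = c(1, 2, 4, 7, 11, 16),
                              b = c(10, 20, 30, 40, 50, 60))

# Row i of the result is input row (i + 3) minus input row i.
d <- diff(as.matrix(your_data_frame), lag = 3)

print(dim(d))
# 3 2
print(d)
```

Here the first row of d is row 4 minus row 1 of the input, i.e. a = 6 and b = 30.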

