R Subtract Value for the Same Id (From the First Id That Shows)

Let's assume that the rows are already sorted by date within each id. I would retrieve the first value for each id and then subtract it from every row of that id to compute the diff column.
Something like this:

my.df <- data.frame(
  id = c(2380, 2380, 2380, 2380, 20100, 20100, 20100, 20100, 20103, 20103),
  date = c("10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12"),
  value = c(21.01, 22.04, 22.65, 23.11, 35.21, 37.07, 38.17, 38.97, 57.98, 60.83),
  stringsAsFactors = FALSE)

# get ids (use levels(my.df$id) instead only if id is a factor)
my.ids <- unique(my.df$id)

# get first val per id (assuming sorting by date)
id.val0 <- sapply(my.ids, function(id) {
  my.df$value[my.df$id == id][1]
})
names(id.val0) <- my.ids

# do operation: look up each row's first value by id and subtract it
my.df$diff <- my.df$value - unname(id.val0[as.character(my.df$id)])
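
The code above relies on the rows already being in date order. As a minimal sketch of one way to drop that assumption, the example below parses the date strings, sorts by id and date, and then uses base R's ave() to pick the first value per id. The shortened data set is illustrative only, and the "%m/%d/%y" format is an assumption about how the date strings are meant to be read.

```r
# Shortened, deliberately unsorted version of the example data (illustrative)
my.df <- data.frame(
  id = c(2380, 2380, 20100, 20100),
  date = c("10/31/12", "10/30/12", "10/30/12", "10/31/12"),
  value = c(22.04, 21.01, 35.21, 37.07),
  stringsAsFactors = FALSE
)

# Assumed format: month/day/two-digit year. Parsing makes the ordering
# chronological rather than alphabetical.
my.df$date <- as.Date(my.df$date, format = "%m/%d/%y")
my.df <- my.df[order(my.df$id, my.df$date), ]

# ave() applies FUN within each id and returns a vector aligned with the rows,
# so the single first value is recycled across its group
my.df$diff <- my.df$value - ave(my.df$value, my.df$id, FUN = function(v) v[1])
```

Because ave() is vectorised over the groups, this avoids looping over rows, which tends to matter on larger data frames.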

How to subtract value from previous observation ID?

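Another option is match(), which finds, for each row, the position of the row whose ObservationID equals that row's PreviousObsID:

df1$ValueLessPrevious <- with(df1, Value -
  Value[match(PreviousObsID, ObservationID)])
df1$ValueLessPrevious
#[1] 25 -35 -240 NA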
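
Since df1 is not shown in the answer, here is a small self-contained sketch of the same match() lookup. The data frame and its values are hypothetical and do not reproduce the output above; only the column names are taken from the snippet.

```r
# Hypothetical data: each row points at an earlier observation (or none)
df1 <- data.frame(
  ObservationID = c(101, 102, 103, 104),
  PreviousObsID = c(NA, 101, 102, 103),
  Value = c(10, 15, 12, 20)
)

# match() returns the row position of each PreviousObsID within ObservationID,
# or NA when there is no previous observation
prev_pos <- match(df1$PreviousObsID, df1$ObservationID)

# Indexing with NA yields NA, so rows without a predecessor stay NA
df1$ValueLessPrevious <- df1$Value - df1$Value[prev_pos]
```

The NA in the first row mirrors the NA in the answer's output: a row with no matching previous observation has nothing to subtract.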

Subtract a value based on the first instance of a group in another column

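You may try grouping by runs of consecutive identical category values (via data.table::rleid), taking the cumulative time within each run, subtracting 200 from runs whose category starts with "A", dropping rows whose adjusted value is not positive, and then converting the cumulative values back into per-row times:

library(dplyr)
library(data.table)

df %>%
  group_by(data.table::rleid(category)) %>%
  mutate(ctime = cumsum(time)) %>%
  mutate(val1 = ifelse(startsWith(category, "A"), ctime - 200, ctime)) %>%
  filter(val1 > 0) %>%
  mutate(time = val1 - ifelse(is.na(lag(val1)), 0, lag(val1))) %>%
  ungroup() %>%
  select(time, look, category)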

    time look  category
   <dbl> <chr> <chr>
 1   150 left  B1
 2   170 right B1
 3   100 left  B1
 4   370 right A1
 5   100 left  A1
 6   100 right A2
 7   100 left  A2
 8   100 right A2
 9   100 left  B1
10   150 right B1
11   200 away  B1
12   100 left  B1
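
The grouping step is what distinguishes this answer from a plain group_by(category): rleid() numbers runs of consecutive equal values, so a category that reappears later (B1 in the output above) starts a new group. A minimal sketch with made-up values, assuming the data.table package is installed:

```r
library(data.table)

# Hypothetical values: B1 appears in two separate runs
category <- c("B1", "B1", "A1", "A1", "B1")
time <- c(100, 50, 30, 20, 10)

# rleid() gives one id per run of consecutive equal values: 1 1 2 2 3
run_id <- rleid(category)

# Cumulative time restarts at every run, not merely at every distinct category
ctime <- ave(time, run_id, FUN = cumsum)
```

Grouping by category alone would instead carry the cumulative sum of the first B1 run into the last row.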

subtract first or second value from each row

We can use first() from dplyr:

library(dplyr)

test %>%
  group_by(two) %>%
  mutate(new = three - first(three))
# A tibble: 6 x 4
# Groups:   two [2]
#   one   two   three   new
#   <chr> <chr> <int> <int>
# 1 c     a         1     0
# 2 d     a         2     1
# 3 e     a         3     2
# 4 c     b         4     0
# 5 d     b         5     1
# 6 e     b         6     2

If we are subsetting the 'three' values by the string "c" in 'one', then we don't need .$, because .$ refers to the whole column of the original data rather than the values within the current group:

test %>%
  group_by(two) %>%
  mutate(new = three - three[one == "c"])
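
To make both variants runnable, here is a minimal sketch in which the test data frame is rebuilt from the printed tibble above. Its original construction is not shown, so treat the data.frame() call as an assumption. The second variant also assumes exactly one "c" row per group.

```r
library(dplyr)

# Rebuilt from the printed output; the original construction is assumed
test <- data.frame(
  one = c("c", "d", "e", "c", "d", "e"),
  two = c("a", "a", "a", "b", "b", "b"),
  three = 1:6
)

# Variant 1: subtract the first value of each group
by_first <- test %>%
  group_by(two) %>%
  mutate(new = three - first(three)) %>%
  ungroup()

# Variant 2: subtract the value on the row labelled "c" within each group
by_label <- test %>%
  group_by(two) %>%
  mutate(new = three - three[one == "c"]) %>%
  ungroup()
```

On this data the two variants agree because the "c" row happens to be the first row of each group. They would differ if the reference row sat elsewhere in the group.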

subtract value from previous row by group

With dplyr:

library(dplyr)

data %>%
  group_by(id) %>%
  arrange(date) %>%
  mutate(diff = value - lag(value, default = first(value)))

For clarity, you can arrange by the grouping column and then by date (as per the comment by lawyer):

data %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(diff = value - lag(value, default = first(value)))

or use lag with order_by:

data %>%
  group_by(id) %>%
  mutate(diff = value - lag(value, default = first(value), order_by = date))

With data.table:

library(data.table)

dt <- as.data.table(data)
setkey(dt, id, date)
dt[, diff := value - shift(value, fill = first(value)), by = id]
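
As a quick check of the dplyr version, here is a minimal sketch on a small made-up data frame whose rows are deliberately out of date order. The column names follow the answer; the values are hypothetical.

```r
library(dplyr)

# Hypothetical, unsorted input
data <- data.frame(
  id = c(1, 1, 1, 2, 2),
  date = as.Date(c("2012-11-01", "2012-10-30", "2012-10-31",
                   "2012-10-31", "2012-10-30")),
  value = c(19, 10, 13, 4, 5)
)

result <- data %>%
  group_by(id) %>%
  arrange(date, .by_group = TRUE) %>%   # sort by id, then by date
  # default = first(value) makes the first row of each group come out as 0
  mutate(diff = value - lag(value, default = first(value))) %>%
  ungroup()
```

After sorting, id 1 has values 10, 13, 19 and id 2 has values 5, 4, so the differences from the previous row are 0, 3, 6 and 0, -1.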

Subtract previous rows from first row by ID in R

First create a cumulative sum on the original DRSG column grouped by ID. We can then use this new column in conjunction with the original ID as our group. From there we leverage dplyr::first to do the subtraction.

library(tidyverse)

data %>%
  mutate(Date = as.Date(Date)) %>%
  group_by(ID) %>%
  mutate(increment_DRSG = cumsum(DRSG)) %>%
  group_by(ID, increment_DRSG) %>%
  mutate(days = Date - first(Date))
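
The cumulative-sum trick can be sketched on made-up data as follows. The input is not shown in the answer, so this assumes DRSG is a 0/1 flag that marks the first row of a new period. The values and dates are hypothetical, and as.numeric() is added here only to turn the difftime result into a plain number of days.

```r
library(dplyr)

# Hypothetical input: DRSG == 1 marks the start of a new period within an ID
data <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2),
  Date = c("2020-01-01", "2020-01-03", "2020-01-10", "2020-01-12",
           "2020-02-01", "2020-02-05"),
  DRSG = c(1, 0, 1, 0, 1, 0)
)

result <- data %>%
  mutate(Date = as.Date(Date)) %>%
  group_by(ID) %>%
  # Running count of flags: every row in the same period gets the same number
  mutate(increment_DRSG = cumsum(DRSG)) %>%
  group_by(ID, increment_DRSG) %>%
  # Days elapsed since the first date of the period
  mutate(days = as.numeric(Date - first(Date))) %>%
  ungroup()
```

Here ID 1 splits into two periods (rows 1 to 2 and rows 3 to 4), and each period counts days from its own first date.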

Subtract values within group

Well, it's a somewhat odd calculation, but slightly to my own surprise, the following seems to do what you explain:

library(dplyr)

set.seed(42)
ID <- sample(1:15, 100, replace = TRUE)
value <- sample(1:4, 100, replace = TRUE)
d <- data.frame(ID, value)

d %>%
  group_by(ID) %>%
  mutate(
    value_c = value * 2 - sum(value)
  ) %>%
  arrange(ID) %>%
  head(n = 20)

Produces:

# A tibble: 20 x 3
# Groups:   ID [3]
      ID value value_c
   <int> <int>   <dbl>
 1     1     1     -12
 2     1     1     -12
 3     1     4      -6
 4     1     1     -12
 5     1     1     -12
 6     1     2     -10
 7     1     4      -6
 8     2     4     -21
 9     2     3     -23
10     2     3     -23
11     2     2     -25
12     2     1     -27
13     2     1     -27
14     2     3     -23
15     2     3     -23
16     2     1     -27
17     2     4     -21
18     2     4     -21
19     3     4      -8
20     3     4      -8

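You multiply value by 2 because it's going to be in the sum() anyway, which you didn't want, so adding it back on the left side takes care of that. In other words, value*2 - sum(value) equals value - (sum(value) - value): each value minus the sum of all the other values in its group.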
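
The identity is easy to verify on a tiny hand-checkable example. The data below are made up, and the check column is added only for comparison:

```r
library(dplyr)

# Hypothetical data, small enough to verify by hand
d <- data.frame(ID = c(1, 1, 1, 2, 2), value = c(1, 4, 2, 3, 5))

result <- d %>%
  group_by(ID) %>%
  mutate(
    value_c = value * 2 - sum(value),      # form used in the answer
    check = value - (sum(value) - value)   # value minus the sum of the others
  ) %>%
  ungroup()
```

For ID 1 the group sum is 7, so the first row gives 1 - (4 + 2) = -5, which is the same as 2 * 1 - 7.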

Subtracting one column by another by the first row in every 5 rows in dataframe

This will work: we just need to group by id, then use the first() function to take the difference from the first value of x in each group.

library(tidyverse)

df1 %>%
  group_by(id) %>%
  mutate(new = x - first(x))

# A tibble: 12 x 5
# Groups:   id [2]
   visit id        x  want   new
   <fct> <chr> <int> <dbl> <int>
 1 scr   101      30     0     0
 2 1mo   101      14   -16   -16
 3 3mo   101      18   -12   -12
 4 6mo0  101      13   -17   -17
 5 12mo  101       2   -28   -28
 6 2yr   101       9   -21   -21
 7 scr   102      17     0     0
 8 1mo   102      21     4     4
 9 3mo   102      10    -7    -7
10 6mo0  102       4   -13   -13
11 12mo  102      19     2     2
12 2yr   102      13    -4    -4
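
If the data had no id column and the groups were simply fixed-size blocks of rows, as the question title suggests, one possible sketch is to build a block index from the row number. The x values below are copied from the output above; block_size and the block column are hypothetical names introduced for illustration, and the block size is set to 6 to match the six visits per id shown.

```r
library(dplyr)

# x values taken from the printed output; no id column is used here
df1 <- data.frame(x = c(30, 14, 18, 13, 2, 9, 17, 21, 10, 4, 19, 13))
block_size <- 6  # hypothetical: number of rows per block

result <- df1 %>%
  # Integer division of the zero-based row number gives 0 for the first
  # block_size rows, 1 for the next block_size rows, and so on
  group_by(block = (row_number() - 1) %/% block_size) %>%
  mutate(new = x - first(x)) %>%
  ungroup()
```

This only makes sense when every block really has the same number of rows in a consistent order; with an explicit id column, grouping by id as in the answer is the safer choice.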

