R subtract value for the same ID (from the first ID that shows)
Let's assume that dates are already sorted. I would probably retrieve the first value for each id and then use this to compute the diff feature.
Something like this.
my.df <- data.frame(
id = c(2380, 2380, 2380, 2380, 20100,20100,20100, 20100, 20103, 20103),
date = c("10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12", "11/1/12", "11/2/12", "10/30/12", "10/31/12"),
value = c(21.01, 22.04, 22.65, 23.11, 35.21, 37.07, 38.17, 38.97, 57.98, 60.83),
stringsAsFactors = FALSE)
#
# get ids
my.ids <- unique(my.df$id) # or levels(my.df$id)
# get first val (assuming sorting by date)
id.val0 <- sapply(my.ids, (function(id){
my.df$value[my.df$id == id][1]
}))
names(id.val0) <- my.ids
# do operation
my.df$diff <- sapply(1:nrow(my.df), (function(i){
tmp.id <- my.df$id[i]
my.df$value[i] - id.val0[as.character(tmp.id)]
}))
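The two sapply() loops above can also be collapsed into a single vectorized call. This is a minimal sketch in base R, assuming (as above) that rows are already sorted by date within each id. ave() applies a function to the values of each id and returns a vector aligned with the original rows. The column name diff2 is only illustrative.

```r
# Same example data as above, trimmed to the columns needed here
my.df <- data.frame(
  id = c(2380, 2380, 2380, 2380, 20100, 20100, 20100, 20100, 20103, 20103),
  value = c(21.01, 22.04, 22.65, 23.11, 35.21, 37.07, 38.17, 38.97, 57.98, 60.83)
)

# For each id, subtract the first value of that id from every value
my.df$diff2 <- ave(my.df$value, my.df$id, FUN = function(v) v - v[1])
print(my.df)
```

The first row of every id gets 0, and the remaining rows hold the difference from that first value.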
How to subtract value from previous observation ID?
Another option is match
df1$ValueLessPrevious <- with(df1, Value -
Value[match(PreviousObsID, ObservationID)])
df1$ValueLessPrevious
#[1] 25 -35 -240 NA
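To make the match() lookup concrete, here is a small self-contained sketch. The data frame below is hypothetical (it is not the data from the original question); only the column names follow the snippet above. match() returns, for each PreviousObsID, the row position of the matching ObservationID, or NA when there is none.

```r
# Hypothetical data: each row may point to an earlier observation
df1 <- data.frame(
  ObservationID = c(101, 205, 307, 412),
  PreviousObsID = c(NA, 101, 205, 101),
  Value = c(10, 25, 5, 40)
)

# Row position of the previous observation (NA where there is none)
idx <- match(df1$PreviousObsID, df1$ObservationID)

# Subtract the value of the referenced row; NA propagates for unmatched rows
df1$ValueLessPrevious <- df1$Value - df1$Value[idx]
print(df1)
```

Because the lookup goes through IDs rather than row order, the previous observation does not have to be the row immediately above.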
Subtract a value based on the first instance of a group in another column
You may try
library(dplyr)
library(data.table)
df %>%
group_by(data.table::rleid( category)) %>%
mutate(ctime = cumsum(time)) %>%
mutate(val1 = ifelse(startsWith(category, "A"),ctime - 200, ctime )) %>%
filter(val1>0) %>%
mutate(time = val1 - ifelse(is.na(lag(val1)), 0, lag(val1))) %>%
ungroup %>%
select(time, look, category)
time look category
<dbl> <chr> <chr>
1 150 left B1
2 170 right B1
3 100 left B1
4 370 right A1
5 100 left A1
6 100 right A2
7 100 left A2
8 100 right A2
9 100 left B1
10 150 right B1
11 200 away B1
12 100 left B1
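The key step in the pipeline above is grouping by runs of consecutive identical categories rather than by category itself. As a rough illustration, the sketch below builds such run ids in base R with rle(), which for a plain vector gives the same ids as data.table::rleid(). The vectors are hypothetical and are not the data from the question.

```r
# Hypothetical input: "B1" appears in two separate runs
category <- c("B1", "B1", "A1", "A1", "A2", "B1")
time <- c(100, 50, 150, 100, 100, 200)

# One id per run of consecutive identical values
runs <- rle(category)
run_id <- rep(seq_along(runs$lengths), runs$lengths)
print(run_id)

# Cumulative time restarts at the beginning of every run
ctime <- ave(time, run_id, FUN = cumsum)
print(ctime)
```

Note that the last "B1" gets its own run id, so its cumulative sum starts over instead of continuing from the first "B1" run.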
subtract first or second value from each row
We can use first() from dplyr:
test %>%
group_by(two) %>%
mutate(new=three- first(three))
# A tibble: 6 x 4
# Groups: two [2]
# one two three new
# <chr> <chr> <int> <int>
#1 c a 1 0
#2 d a 2 1
#3 e a 3 2
#4 c b 4 0
#5 d b 5 1
#6 e b 6 2
If we are subsetting the 'three' values based on the string "c" in 'one', then we don't need .$ because .$ refers to the whole data frame and would return the entire column instead of only the values within the current group.
test %>%
group_by(`two`) %>%
mutate(new=three-three[one=="c"])
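For comparison, the same "subtract the value of the row labelled c in each group" idea can be sketched in base R. The data frame is reconstructed from the printed output above; the helper name ref is only illustrative. The sketch assumes there is exactly one "c" row per group and checks that explicitly.

```r
# Data reconstructed from the printed tibble above
test <- data.frame(
  one = c("c", "d", "e", "c", "d", "e"),
  two = c("a", "a", "a", "b", "b", "b"),
  three = 1:6
)

# Rows holding the reference value for each group
ref <- test[test$one == "c", ]

# Assumption: at most one reference row per group
stopifnot(!anyDuplicated(ref$two))

# Look up each row's group reference and subtract it
test$new <- test$three - ref$three[match(test$two, ref$two)]
print(test)
```

A group without any "c" row would get NA in new rather than raising an error.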
subtract value from previous row by group
With dplyr:
library(dplyr)
data %>%
group_by(id) %>%
arrange(date) %>%
mutate(diff = value - lag(value, default = first(value)))
For clarity, you can arrange by date and grouping column (as per the comment by lawyer):
data %>%
group_by(id) %>%
arrange(date, .by_group = TRUE) %>%
mutate(diff = value - lag(value, default = first(value)))
Or use lag with order_by:
data %>%
group_by(id) %>%
mutate(diff = value - lag(value, default = first(value), order_by = date))
With data.table:
library(data.table)
dt <- as.data.table(data)
setkey(dt, id, date)
dt[, diff := value - shift(value, fill = first(value)), by = id]
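If neither package is available, a base R version of the same idea might look like the sketch below. The data frame is hypothetical and deliberately unsorted. Within each id, c(0, diff(v)) gives 0 for the first row and the change from the previous row afterwards, which matches lag() with default = first(value).

```r
# Hypothetical data, deliberately not sorted by date
data <- data.frame(
  id = c(1, 1, 1, 2, 2),
  date = as.Date(c("2020-01-03", "2020-01-01", "2020-01-02",
                   "2020-01-01", "2020-01-02")),
  value = c(30, 10, 15, 7, 4)
)

# Sort by id and date so "previous row" means "previous date"
data <- data[order(data$id, data$date), ]

# Difference from the previous row within each id; first row of each id is 0
data$diff <- ave(data$value, data$id, FUN = function(v) c(0, diff(v)))
print(data)
```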
Subtract previous rows from first row by ID in R
First create a cumulative sum on the original DRSG column grouped by ID. We can then use this new column in conjunction with the original ID as our group. From there we leverage dplyr::first to do the subtraction.
library(tidyverse)
data %>%
mutate(Date = as.Date(Date)) %>%
group_by(ID) %>%
mutate(increment_DRSG = cumsum(DRSG)) %>%
group_by(ID, increment_DRSG) %>%
mutate(days = Date - first(Date))
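To see how the cumulative sum turns event flags into groups, here is a minimal base R sketch of the same two steps. The data is hypothetical: DRSG is assumed to be a 0/1 flag, and the column names follow the snippet above. Dates are converted to numbers so that ave() can be used for the grouped subtraction.

```r
# Hypothetical data: DRSG = 1 marks the start of a new period
data <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2),
  Date = as.Date(c("2020-01-01", "2020-01-03", "2020-01-04",
                   "2020-01-10", "2020-02-01", "2020-02-05")),
  DRSG = c(0, 1, 0, 0, 1, 0)
)

# Step 1: running count of events within each ID defines the period
data$increment_DRSG <- ave(data$DRSG, data$ID, FUN = cumsum)

# Step 2: days since the first date of each (ID, period) combination
data$days <- ave(as.numeric(data$Date), data$ID, data$increment_DRSG,
                 FUN = function(d) d - d[1])
print(data)
```

Every time DRSG is 1 the counter increases, so the following rows are measured from that row's date.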
Subtract values within group
Well, it's a somewhat odd calculation, but slightly to my own surprise, the following seems to do what you explain:
library(dplyr)
set.seed(42)
ID <- sample(1:15, 100, replace = TRUE)
value <- sample(1:4, 100, replace = TRUE)
d <- data.frame(ID, value)
d %>% group_by( ID ) %>%
mutate(
value_c = value*2 - sum(value)
) %>%
arrange( ID ) %>%
head( n=20 )
Produces:
# A tibble: 20 x 3
# Groups: ID [3]
ID value value_c
<int> <int> <dbl>
1 1 1 -12
2 1 1 -12
3 1 4 -6
4 1 1 -12
5 1 1 -12
6 1 2 -10
7 1 4 -6
8 2 4 -21
9 2 3 -23
10 2 3 -23
11 2 2 -25
12 2 1 -27
13 2 1 -27
14 2 3 -23
15 2 3 -23
16 2 1 -27
17 2 4 -21
18 2 4 -21
19 3 4 -8
20 3 4 -8
You multiply value by 2 because each row's own value is included in sum(value), which you didn't want; adding it back on the left side cancels it out, so value*2 - sum(value) equals value minus the sum of the other values in the group.
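The arithmetic can be checked on a single small group. This is only a sanity-check sketch with made-up numbers: it compares the shortcut value*2 - sum(value) against an explicit "value minus the sum of all other values" computation.

```r
# One hypothetical group of values
value <- c(1, 4, 2)

# Shortcut used above
value_c <- value * 2 - sum(value)

# Explicit version: for each element, sum everything except that element
others <- sapply(seq_along(value), function(i) sum(value[-i]))

# Both approaches agree
stopifnot(all(value_c == value - others))
print(value_c)  # -5 1 -3
```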
Subtracting one column by another by the first row in every 5 rows in dataframe
This will work; we just need to group by id, then take advantage of the first() function to take the difference versus the first value of x for each group.
library(tidyverse)
df1 %>% group_by(id) %>% mutate(new = x - first(x))
# A tibble: 12 x 5
# Groups: id [2]
visit id x want new
<fct> <chr> <int> <dbl> <int>
1 scr 101 30 0 0
2 1mo 101 14 -16 -16
3 3mo 101 18 -12 -12
4 6mo0 101 13 -17 -17
5 12mo 101 2 -28 -28
6 2yr 101 9 -21 -21
7 scr 102 17 0 0
8 1mo 102 21 4 4
9 3mo 102 10 -7 -7
10 6mo0 102 4 -13 -13
11 12mo 102 19 2 2
12 2yr 102 13 -4 -4