R - Cumulative Sum by Condition

Conditional cumulative sum and grouping in R

I was able to find an answer, with credit to the linked post.

library(dplyr)
library(purrr)
myDat %>% mutate(cumsum_15 = accumulate(Freq, ~ifelse(.x + .y <= 15000000, .x + .y, .y)),
                 group_15 = cumsum(Freq == cumsum_15))
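
As a minimal sketch of what the accumulate call does, here is the same reset rule in base R, with Reduce(accumulate = TRUE) swapped in for purrr::accumulate. The freq vector and the limit of 15 are made-up values for illustration, not the original data.

```r
# Hypothetical data: small values and a limit of 15 instead of 15000000.
freq  <- c(5, 6, 7, 3, 10, 4)
limit <- 15

# Keep adding while the running total stays within the limit;
# otherwise restart the total at the current value.
running <- Reduce(function(acc, x) if (acc + x <= limit) acc + x else x,
                  freq, accumulate = TRUE)

# A restart is a row where the running total equals the row's own value,
# so counting those rows numbers the groups.
group <- cumsum(freq == running)

print(running)  # 5 11 7 10 10 14
print(group)    # 1 1 2 2 3 3
```

The equality test assumes a running total can only equal the current value right after a restart, which holds when all values are positive.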

Conditional cumulative sum from two columns

You were essentially headed in the right direction. Since you provide an .init value to accumulate, the resulting vector has length n+1, with .init as its first value. You have to drop that first value to get a vector that matches the length of your column.

To get NA in the rows where source is not "B", wrap the result in ifelse. Also, since the "starting row" is the third, .init has to be set to 8: the two add values before it (1 and 1) bring the running total to 10, the last known value, so row 3 yields 11.

library(dplyr)
library(purrr)

df %>%
  mutate(test =
           ifelse(source == "B", accumulate(add, .init = 8, ~.x + .y)[-1], NA))

# A tibble: 6 x 4
  source value   add  test
  <chr>  <dbl> <dbl> <dbl>
1 A          5     1    NA
2 A         10     1    NA
3 B         NA     1    11
4 B         NA     2    13
5 B         NA     3    16
6 C         20     4    NA
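
The effect of .init on the result length can be checked in isolation. This is a small sketch that reuses the add column from the table above as a plain vector; with_init is just an illustrative name.

```r
library(purrr)

# Same values as the add column in the answer above.
add <- c(1, 1, 1, 2, 3, 4)

# Seeding with .init puts the seed itself at the front of the result.
with_init <- accumulate(add, `+`, .init = 8)

print(length(add))        # 6
print(length(with_init))  # 7: the seed 8 comes first
print(with_init[-1])      # 9 10 11 13 16 20
```

Dropping the first element with [-1] restores the original length, and the third element (11) is the value that ends up in row 3.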

Conditional cumulative sum with dplyr

Since converting a logical to numeric gives 0 for FALSE and 1 for TRUE, you can simply multiply sales by act:

library(dplyr)
df %>% group_by(prod) %>%
  mutate(cum_sales = cumsum(sales*act))

  prod   act   sales cum_sales
  <fctr> <lgl> <dbl>     <dbl>
1 A      TRUE    100       100
2 A      TRUE    120       220
3 A      TRUE    190       410
4 A      FALSE    50       410
5 B      TRUE     30        30
6 B      TRUE     40        70
7 B      FALSE    50        70
8 B      FALSE    10        70
9 B      FALSE    30        70
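
The same multiplication trick can be sketched without dplyr. Here base R's ave stands in for group_by plus mutate; the vectors repeat the values from the output above, and counted is an illustrative name.

```r
prod  <- c("A", "A", "A", "A", "B", "B", "B", "B", "B")
act   <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE)
sales <- c(100, 120, 190, 50, 30, 40, 50, 10, 30)

# Multiplying coerces act to 0/1, so inactive rows contribute 0.
counted <- sales * act

# ave() applies cumsum separately within each level of prod.
cum_sales <- ave(counted, prod, FUN = cumsum)
print(cum_sales)  # 100 220 410 410 30 70 70 70 70
```

Note that an NA in either sales or act would propagate through cumsum for the rest of that group.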

Group by cumulative sums with conditions

Use na.locf0 from zoo to fill in the NAs and then apply rleid from data.table:

library(data.table)
library(zoo)

rleid(na.locf0(df$ID))
## [1] 1 2 2 2 2 3 4 4 5 5 5
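
To show the two steps separately, here is a small sketch on a made-up vector (the original df$ID is not shown, so id below is an assumption for illustration only).

```r
library(data.table)
library(zoo)

# Hypothetical IDs with gaps; not the original df$ID.
id <- c(10, NA, NA, 20, 20, NA, 10)

# Step 1: carry the last non-NA value forward.
filled <- na.locf0(id)
print(filled)  # 10 10 10 20 20 20 10

# Step 2: number each run of identical consecutive values.
run_id <- rleid(filled)
print(run_id)  # 1 1 1 2 2 2 3
```

Because rleid counts runs rather than distinct values, the final 10 gets a new id (3) instead of reusing 1.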

R how to cumulative sums up until condition, including the row where the condition changes

We can create a grouping column from the logical column 'a' by taking its cumulative sum and lagging the result, so that each TRUE row stays in the group it closes. Then take the cumsum of column 'b' within each group.

library(dplyr)
df1 %>%
  group_by(grp = lag(cumsum(a), default = 0)) %>%
  mutate(c = row_number(), d = cumsum(b)) %>%
  ungroup %>%
  select(-grp)

-output

# A tibble: 7 x 4
# a b c d
# <lgl> <dbl> <int> <dbl>
#1 FALSE 30.5 1 30.5
#2 FALSE 27.8 2 58.3
#3 FALSE 26.9 3 85.3
#4 TRUE 41.7 4 127.
#5 FALSE 2.86 1 2.86
#6 FALSE 16.3 2 19.2
#7 TRUE 40.2 3 59.4


Or, using data.table with the same logic: group by the shifted cumulative sum of 'a', then create 'c' as the row index and 'd' as the cumsum of 'b'.

library(data.table)
setDT(df1)[, c('c', 'd') := .(1:.N, cumsum(b)),
.(grp = shift(cumsum(a), fill = 0))]

data

df1 <- structure(list(a = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, 
TRUE), b = c(30.53, 27.8, 26.93, 41.66, 2.86, 16.31, 40.19)),
class = "data.frame", row.names = c(NA,
-7L))
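
To see why the lag matters, the grouping vector can be built by hand in base R. This is a sketch using the a and b values from df1 above, with ave standing in for the grouped mutate.

```r
a <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
b <- c(30.53, 27.8, 26.93, 41.66, 2.86, 16.31, 40.19)

# The plain cumulative sum changes ON the TRUE row ...
print(cumsum(a))  # 0 0 0 1 1 1 2

# ... while shifting it down by one changes one row later,
# so the TRUE row still belongs to the group it ends.
grp <- c(0, head(cumsum(a), -1))
print(grp)        # 0 0 0 0 1 1 1

d <- ave(b, grp, FUN = cumsum)
print(d)          # 30.53 58.33 85.26 126.92 2.86 19.17 59.36
```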

Cumulative sum while a condition is met with NA

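You could use ave, which applies a function to b within each group of a. Here the NAs are replaced by 0 before cumsum and restored afterwards, so they neither break nor advance the running sum. (The \(x) shorthand needs R 4.1 or later.)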

ave(b, a, FUN=\(x) {r <- cumsum(replace(x, is.na(x), 0)); replace(r, is.na(x), NA)})
# [1] NA NA NA 3 NA NA 8 NA NA 4 9 NA 11 NA 1 NA NA 2
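
Since the original a and b are not shown, here is a self-contained sketch of the same idea on made-up data. The helper name na_safe_cumsum is hypothetical, and function(x) is used in place of \(x) so it also runs on older R versions.

```r
# Hypothetical data; not the original a and b.
a <- c("x", "x", "x", "y", "y", "y")
b <- c(3, NA, 2, NA, 4, 1)

na_safe_cumsum <- function(x) {
  r <- cumsum(replace(x, is.na(x), 0))  # treat NA as 0 while summing
  replace(r, is.na(x), NA)              # put NA back where it was
}

res <- ave(b, a, FUN = na_safe_cumsum)
print(res)  # 3 NA 5 NA 4 5
```

For comparison, a plain cumsum(c(3, NA, 2)) gives 3 NA NA, because every value after the first NA becomes NA.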

cumsum with a condition to restart in R

You may use cumsum to create groups as well.

library(dplyr)

df <- df %>%
  group_by(group = cumsum(dplyr::lag(port == 0, default = 0))) %>%
  mutate(cumsum_G = cumsum(G)) %>%
  ungroup

df

# inv ass port G group cumsum_G
# <chr> <chr> <int> <int> <dbl> <int>
#1 i x 2 1 0 1
#2 i x 2 0 0 1
#3 i x 0 1 0 2
#4 i x 3 0 1 0
#5 i x 3 1 1 1

You may remove the group column from output using %>% select(-group).

data

df <- structure(list(inv = c("i", "i", "i", "i", "i"), ass = c("x", 
"x", "x", "x", "x"), port = c(2L, 2L, 0L, 3L, 3L), G = c(1L,
0L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -5L))
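
The role of the lag can be seen by comparing both groupings side by side. This is a base R sketch on the port and G values from df above; the variable names are illustrative.

```r
port <- c(2L, 2L, 0L, 3L, 3L)
G    <- c(1L, 0L, 1L, 0L, 1L)

# Without the lag, the port == 0 row opens the next group.
grp_no_lag <- cumsum(port == 0)                      # 0 0 1 1 1
# With the lag, the port == 0 row closes the current group.
grp_lag    <- cumsum(c(FALSE, head(port == 0, -1)))  # 0 0 0 1 1

no_lag   <- ave(G, grp_no_lag, FUN = cumsum)
with_lag <- ave(G, grp_lag, FUN = cumsum)

print(no_lag)    # 1 1 1 1 2
print(with_lag)  # 1 1 2 0 1
```

Only the lagged version reproduces the cumsum_G column shown above, where the restart happens on the row after port == 0.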

