How to Get the Cumulative Sum by Group in R

Calculate cumulative sum (cumsum) by group

df$csum <- ave(df$value, df$id, FUN=cumsum)

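ave is the go-to function when you want a by-group vector of the same length as an existing vector, and the result can be computed from that vector's per-group subvectors alone. If the by-group computation depends on several "parallel" columns, the base R strategy is do.call(rbind, by(dfrm, grp, FUN)), where dfrm is the data frame, grp is the grouping vector, and FUN is a function applied to each group's rows.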
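As a minimal sketch of both strategies, the following uses a small made-up data frame. The column names id and value follow the snippet above; the weight column and the wcsum result are hypothetical names added only for illustration.

```r
# Hypothetical sample data; id and value match the names used above
df <- data.frame(id = c("a", "a", "b", "b", "b"),
                 value = c(1, 2, 10, 20, 30))

# ave(): running total of value, restarted for each id
df$csum <- ave(df$value, df$id, FUN = cumsum)
df$csum
# 1 3 10 30 60

# by(): each group's rows arrive as a data frame, so the function
# can combine several columns (weight is a hypothetical second column)
df$weight <- c(2, 1, 1, 1, 2)
pieces <- by(df, df$id, function(g) {
  g$wcsum <- cumsum(g$value * g$weight)
  g
})
res <- do.call(rbind, pieces)
res$wcsum
# 2 4 10 30 90
```

ave keeps the original row order, while do.call(rbind, by(...)) returns the rows grouped by id.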

Cumulative sum with `all` or `any` by group

You're not actually doing a cumsum--nothing needs to be summed. You are looking for the row number within the group.

Here are a couple of ways with dplyr:

library(dplyr)

df %>%
  group_by(group) %>%
  mutate(
    # row number within the group, zeroed when no y in the group is a multiple of 3
    result1 = row_number() * any(y %% 3 == 0),
    result2 = case_when(
      any(y %% 3 == 0) ~ row_number(),
      TRUE ~ 0L
    )
  )
# # A tibble: 12 × 4
# # Groups: group [6]
# group y result1 result2
# <int> <int> <int> <int>
# 1 1 1 0 0
# 2 1 2 0 0
# 3 2 3 1 1
# 4 2 4 2 2
# 5 3 5 1 1
# 6 3 6 2 2
# 7 4 7 0 0
# 8 4 8 0 0
# 9 5 9 1 1
# 10 5 10 2 2
# 11 6 11 1 1
# 12 6 12 2 2
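The same idea can be sketched in base R with ave. The input here is reconstructed from the printed output above, so treat it as an assumption about the original data; rn, has_mult3 and result are names chosen for this illustration.

```r
# Input reconstructed from the printed output above (an assumption)
df <- data.frame(group = rep(1:6, each = 2), y = 1:12)

# Row number within each group
rn <- ave(seq_along(df$y), df$group, FUN = seq_along)

# TRUE for every row of a group that contains a multiple of 3
has_mult3 <- ave(df$y %% 3 == 0, df$group, FUN = any)

# Multiplying by the logical flag zeroes out the groups that fail the test
df$result <- rn * has_mult3
df$result
# 0 0 1 2 1 2 0 0 1 2 1 2
```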

How to get the cumulative sum by group in R?

library(data.table)

# convert to data.table in place
setDT(df)

# dcast and do individual sums
dt.cast <- dcast(df, group ~ rep, value.var = 'value',
                 fun.aggregate = sum)
dt.cast
# group d1 d2
#1: 0 0 1
#2: 1 1 2

# cumsum
dt.cast[, as.list(cumsum(unlist(.SD))), by = group]
# group d1 d2
#1: 0 0 1
#2: 1 1 3

Conditional cumulative sum and grouping in R

I was able to find an answer, with credit to a linked post:

library(dplyr)
library(purrr)
myDat %>% mutate(cumsum_15 = accumulate(Freq, ~ ifelse(.x + .y <= 15000000, .x + .y, .y)),
                 group_15 = cumsum(Freq == cumsum_15))

Group according to cumulative sums

Two possible one-liners, with purrr::accumulate and with MESS::cumsumbinning:

purrr::accumulate

library(tidyverse)
group_by(input, grp = LETTERS[cumsum(value == accumulate(value, ~ ifelse(.x + .y <= 100, .x + .y, .y)))])

MESS::cumsumbinning

library(dplyr)
group_by(input, grp = LETTERS[MESS::cumsumbinning(value, 100)])

output

# A tibble: 6 x 3
# Groups: grp [3]
id value grp
<int> <dbl> <chr>
1 1 99 A
2 2 1 A
3 3 33 B
4 4 33 B
5 5 33 B
6 6 150 C
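To see why the accumulate trick works, here is a step-by-step sketch in base R, with Reduce(..., accumulate = TRUE) standing in for purrr::accumulate. The input is reconstructed from the output above and is therefore an assumption; cap and running are names chosen for this illustration.

```r
# Input reconstructed from the output above (an assumption)
input <- data.frame(id = 1:6, value = c(99, 1, 33, 33, 33, 150))
cap <- 100

# Carry the running total forward while it stays within the cap;
# otherwise restart the total from the current value
running <- Reduce(function(acc, x) if (acc + x <= cap) acc + x else x,
                  input$value, accumulate = TRUE)
running
# 99 100 33 66 99 150

# A row starts a new bin exactly when the running total equals its own value
input$grp <- LETTERS[cumsum(input$value == running)]
input$grp
# "A" "A" "B" "B" "B" "C"
```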

Group by cumulative sums with conditions

Use na.locf0 from zoo to fill in the NAs and then apply rleid from data.table:

library(data.table)
library(zoo)

rleid(na.locf0(df$ID))
## [1] 1 2 2 2 2 3 4 4 5 5 5
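The two steps can also be sketched in base R, which may help clarify what na.locf0 and rleid each contribute. The id vector below is hypothetical, chosen so that the result has the same shape as the output above; this is a stand-in for the zoo and data.table calls, not their implementation.

```r
# Hypothetical ID column with gaps
id <- c("a", "b", NA, NA, "b", "c", "a", NA, "d", NA, NA)

# Step 1: carry the last non-missing value forward
# (the leading NA in the lookup vector keeps any leading NAs as NA)
idx <- cumsum(!is.na(id))
filled <- c(NA, id[!is.na(id)])[idx + 1]
filled
# "a" "b" "b" "b" "b" "c" "a" "a" "d" "d" "d"

# Step 2: run-length id, incremented whenever the value changes
# (this sketch assumes the first element is not NA)
run_id <- cumsum(c(TRUE, filled[-1] != filled[-length(filled)]))
run_id
# 1 2 2 2 2 3 4 4 5 5 5
```

Note that the second run of "a" gets a new id (4) rather than reusing 1: run-length ids track consecutive runs, not distinct values.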

Cumulative total by group

If you just want cumulative sums per group, then you can do

transform(d, new=ave(value,group,FUN=cumsum))

with base R.

Cumulative sum in R by group and start over when sum of values in group larger than maximum value

One purrr approach could be:

library(purrr)
cumsum(c(FALSE, diff(accumulate(test, ~ ifelse(.x >= 10, .y, .x + .y))) <= 0))

[1] 0 0 1 1 1 2 2 2 3
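The intermediate values are easier to follow when spelled out. This sketch uses base R's Reduce(..., accumulate = TRUE) in place of purrr::accumulate, with a hypothetical test vector chosen to give the same group ids as the output above; limit, running and grp are illustrative names.

```r
# Hypothetical input; the original test vector is not shown
test <- c(4, 7, 3, 5, 6, 2, 4, 9, 1)
limit <- 10

# Once the running total has reached the limit, restart from the next value
running <- Reduce(function(acc, x) if (acc >= limit) x else acc + x,
                  test, accumulate = TRUE)
running
# 4 11 3 8 14 2 6 15 1

# A drop in the running total marks the start of a new group
grp <- cumsum(c(FALSE, diff(running) <= 0))
grp
# 0 0 1 1 1 2 2 2 3
```

Detecting restarts through diff(...) <= 0 relies on the running total dropping at each restart. If a restart value exceeds the previous total (for example a total of 11 followed by a value of 12), the restart goes undetected, so this approach is best suited to values well below the limit.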

