R Cumsum Per Group in Dplyr

r cumsum per group in dplyr

Ah. After fiddling around I seem to have found it.

pdf = df %>% group_by(group) %>% arrange(dates) %>% mutate(cs = cumsum(sales))

Output with the for loop from the question:

> pdf = data.frame(dates=as.Date(as.character()), group=as.character(), sales=as.numeric())
> for(grp in unique(df$group)){
+   subs = filter(df, group == grp) %>% arrange(dates)
+   pdf = rbind(pdf, data.frame(dates=subs$dates, group=grp, sales=subs$sales, cs=cumsum(subs$sales)))
+ }
> pdf
        dates group       sales         cs
1  2014-01-02     A -0.56047565 -0.5604756
2  2014-01-03     A -0.23017749 -0.7906531
3  2014-01-04     A  1.55870831  0.7680552
4  2014-01-05     A  0.07050839  0.8385636
5  2014-01-06     A  0.12928774  0.9678513
6  2014-01-02     B  1.71506499  1.7150650
7  2014-01-03     B  0.46091621  2.1759812
8  2014-01-04     B -1.26506123  0.9109200
9  2014-01-05     B -0.68685285  0.2240671
10 2014-01-06     B -0.44566197 -0.2215949
11 2014-01-02     C  1.22408180  1.2240818
12 2014-01-03     C  0.35981383  1.5838956
13 2014-01-04     C  0.40077145  1.9846671
14 2014-01-05     C  0.11068272  2.0953498
15 2014-01-06     C -0.55584113  1.5395087

Output with this line of code:

> pdf = df %>% group_by(group) %>% mutate(cs = cumsum(sales))
> pdf
Source: local data frame [15 x 4]
Groups: group

        dates group       sales         cs
1  2014-01-02     A -0.56047565 -0.5604756
2  2014-01-03     A -0.23017749 -0.7906531
3  2014-01-04     A  1.55870831  0.7680552
4  2014-01-05     A  0.07050839  0.8385636
5  2014-01-06     A  0.12928774  0.9678513
6  2014-01-02     B  1.71506499  1.7150650
7  2014-01-03     B  0.46091621  2.1759812
8  2014-01-04     B -1.26506123  0.9109200
9  2014-01-05     B -0.68685285  0.2240671
10 2014-01-06     B -0.44566197 -0.2215949
11 2014-01-02     C  1.22408180  1.2240818
12 2014-01-03     C  0.35981383  1.5838956
13 2014-01-04     C  0.40077145  1.9846671
14 2014-01-05     C  0.11068272  2.0953498
15 2014-01-06     C -0.55584113  1.5395087
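For anyone who wants to try this without the original data, here is a minimal sketch on made-up values (the four-row `df` below is hypothetical, not the data from the question). It sorts first and groups second, since `arrange()` ignores grouping unless `.by_group = TRUE` is set, and it drops the grouping at the end.

```r
library(dplyr)

# Hypothetical data: two groups, rows deliberately out of date order
df <- data.frame(
  dates = as.Date(c("2014-01-03", "2014-01-02", "2014-01-02", "2014-01-03")),
  group = c("A", "A", "B", "B"),
  sales = c(2, 1, 10, 20)
)

pdf <- df %>%
  arrange(group, dates) %>%      # put rows in date order within each group
  group_by(group) %>%
  mutate(cs = cumsum(sales)) %>% # running total restarts for each group
  ungroup()

pdf
```

The running total is only meaningful if the rows are in the intended order before `cumsum()` runs, which is why the sort comes first.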

How to do cumsum of 2 groups by dplyr?

Looks like you just need

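dat %>%
  group_by(item, choice) %>%
  # summarize() drops the last grouping level (choice),
  # so the cumsum() below runs within each item
  summarize(n = n()) %>%
  mutate(cum = cumsum(n))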
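To see why this works, here is a small sketch with invented data (the `dat` below is hypothetical). It assumes dplyr 1.0.0 or later, where `summarize()` accepts a `.groups` argument; spelling out `"drop_last"` makes explicit that the result stays grouped by `item` only, so the cumulative sum restarts per item.

```r
library(dplyr)

# Hypothetical data: counts of choices per item
dat <- data.frame(
  item   = c("a", "a", "a", "b", "b"),
  choice = c("x", "x", "y", "x", "y")
)

res <- dat %>%
  group_by(item, choice) %>%
  summarize(n = n(), .groups = "drop_last") %>% # still grouped by item
  mutate(cum = cumsum(n)) %>%                   # cumulative count within item
  ungroup()

res
```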

cumsum in grouped data with dplyr

When you group by Local.Authority and year, every group contains a single row, so cumsum just returns total unchanged (1, -1, 1). Group by Local.Authority only, so that cumsum accumulates total across the years and gives 1, 0, 1.

    df <- df %>%
      group_by(Local.Authority) %>%
      mutate(cum.to = cumsum(total))

    > df
    Source: local data frame [3 x 8]
    Groups: Local.Authority [1]

      Provider.ID Local.Authority month  year entry  exit total cum.to
            <chr>           <chr> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
    1 1-102642676            Kent    10  2010     1     0     1      1
    2 1-102642676            Kent     9  2011     0     1    -1      0
    3 1-102642676            Kent    10  2014     1     0     1      1
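The difference between the two groupings can be checked side by side. This is a sketch that rebuilds only the three columns needed from the printed output above; the names `by_both` and `by_la` are made up for the illustration.

```r
library(dplyr)

# Columns copied from the printed output above
df <- data.frame(
  Local.Authority = rep("Kent", 3),
  year            = c(2010, 2011, 2014),
  total           = c(1, -1, 1)
)

# Grouping by both columns: one row per group, nothing accumulates
by_both <- df %>%
  group_by(Local.Authority, year) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

# Grouping by Local.Authority only: total accumulates across years
by_la <- df %>%
  group_by(Local.Authority) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

by_both$cum.to
by_la$cum.to
```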

Cumulative count by group over time in r

You can use count and cumsum:

library(dplyr)

df %>%
  count(group, time, name = 'count') %>%
  group_by(group) %>%
  mutate(count = cumsum(count)) %>%
  ungroup()

#   group  time count
#  <chr> <dbl> <int>
#1 A         1     2
#2 A         2     3
#3 A         3     4
#4 B         1     1
#5 B         2     3
#6 C         1     1
#7 C         2     3
#8 C         3     5
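The input data is not shown in the answer. As a sketch, the `df` below is an assumed input, chosen so that the per-time counts add up to the cumulative counts printed above; the real data may have looked different.

```r
library(dplyr)

# Assumed input: one row per observation
df <- data.frame(
  group = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"),
  time  = c(1, 1, 2, 3, 1, 2, 2, 1, 2, 2, 3, 3)
)

res <- df %>%
  count(group, time, name = "count") %>% # rows per group and time
  group_by(group) %>%
  mutate(count = cumsum(count)) %>%      # running count over time, per group
  ungroup()

res
```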

Cumulative sum with `all` or `any` by group

You're not actually doing a cumsum; nothing needs to be summed. You are looking for the row number within the group.

Here are a couple ways with dplyr:

df %>%
  group_by(group) %>%
  mutate(
    result1 = row_number() * any(y %% 3 == 0),
    result2 = case_when(
      any(y %% 3 == 0) ~ row_number(),
      TRUE ~ 0L
    )
  )
# # A tibble: 12 × 4
# # Groups:   group [6]
#    group     y result1 result2
#    <int> <int>   <int>   <int>
#  1     1     1       0       0
#  2     1     2       0       0
#  3     2     3       1       1
#  4     2     4       2       2
#  5     3     5       1       1
#  6     3     6       2       2
#  7     4     7       0       0
#  8     4     8       0       0
#  9     5     9       1       1
# 10     5    10       2       2
# 11     6    11       1       1
# 12     6    12       2       2
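To run the first variant on its own, the input can be reconstructed from the printed result. The sketch below assumes `group` is 1 to 6 repeated twice and `y` is 1 to 12, which is inferred from the output rather than taken from the question.

```r
library(dplyr)

# Input inferred from the printed output above
df <- data.frame(group = rep(1:6, each = 2), y = 1:12)

res <- df %>%
  group_by(group) %>%
  # any() gives one TRUE/FALSE per group; multiplying by it keeps the
  # row number where the group has a multiple of 3 and zeroes it otherwise
  mutate(result1 = row_number() * any(y %% 3 == 0)) %>%
  ungroup()

res$result1
```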

Calculate cumulative sum (cumsum) by group

df$csum <- ave(df$value, df$id, FUN=cumsum)

ave is the "go-to" function when you want a by-group vector of the same length as an existing vector, and it can be computed from those subvectors alone. If you need by-group processing based on multiple "parallel" values, the base strategy is do.call(rbind, by(dfrm, grp, FUN)).
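Both base R strategies can be sketched on a toy data frame. The data and the `weight` and `wcsum` columns are hypothetical, added only to give the `by()` version a second "parallel" column to work with.

```r
# Hypothetical data: two ids with a value and a weight per row
df <- data.frame(
  id     = c(1, 1, 2, 2, 2),
  value  = c(10, 20, 1, 2, 3),
  weight = c(1, 2, 1, 1, 2)
)

# One input vector: ave() returns a vector as long as df$value
df$csum <- ave(df$value, df$id, FUN = cumsum)

# Several input columns: split with by(), process each piece, rbind back
res <- do.call(rbind, by(df, df$id, function(d) {
  d$wcsum <- cumsum(d$value * d$weight) # needs both columns at once
  d
}))

res
```

Note that the rbind result comes back ordered by id, with row names built from the group and the original row name, so it is worth resetting them with rownames(res) <- NULL if they matter.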
