r cumsum per group in dplyr
Ah. After fiddling around I seem to have found it.
pdf = df %>% group_by(group) %>% arrange(dates) %>% mutate(cs = cumsum(sales))
Output with the for loop from the question:
> pdf = data.frame(dates=as.Date(as.character()), group=as.character(), sales=as.numeric())
> for(grp in unique(df$group)){
+ subs = filter(df, group == grp) %>% arrange(dates)
+ pdf = rbind(pdf, data.frame(dates=subs$dates, group=grp, sales=subs$sales, cs=cumsum(subs$sales)))
+ }
> pdf
dates group sales cs
1 2014-01-02 A -0.56047565 -0.5604756
2 2014-01-03 A -0.23017749 -0.7906531
3 2014-01-04 A 1.55870831 0.7680552
4 2014-01-05 A 0.07050839 0.8385636
5 2014-01-06 A 0.12928774 0.9678513
6 2014-01-02 B 1.71506499 1.7150650
7 2014-01-03 B 0.46091621 2.1759812
8 2014-01-04 B -1.26506123 0.9109200
9 2014-01-05 B -0.68685285 0.2240671
10 2014-01-06 B -0.44566197 -0.2215949
11 2014-01-02 C 1.22408180 1.2240818
12 2014-01-03 C 0.35981383 1.5838956
13 2014-01-04 C 0.40077145 1.9846671
14 2014-01-05 C 0.11068272 2.0953498
15 2014-01-06 C -0.55584113 1.5395087
Output with the dplyr one-liner (shown here without arrange(), since the rows were already in date order within each group):
> pdf = df %>% group_by(group) %>% mutate(cs = cumsum(sales))
> pdf
Source: local data frame [15 x 4]
Groups: group
dates group sales cs
1 2014-01-02 A -0.56047565 -0.5604756
2 2014-01-03 A -0.23017749 -0.7906531
3 2014-01-04 A 1.55870831 0.7680552
4 2014-01-05 A 0.07050839 0.8385636
5 2014-01-06 A 0.12928774 0.9678513
6 2014-01-02 B 1.71506499 1.7150650
7 2014-01-03 B 0.46091621 2.1759812
8 2014-01-04 B -1.26506123 0.9109200
9 2014-01-05 B -0.68685285 0.2240671
10 2014-01-06 B -0.44566197 -0.2215949
11 2014-01-02 C 1.22408180 1.2240818
12 2014-01-03 C 0.35981383 1.5838956
13 2014-01-04 C 0.40077145 1.9846671
14 2014-01-05 C 0.11068272 2.0953498
15 2014-01-06 C -0.55584113 1.5395087
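To make the role of the sort step concrete, here is a minimal sketch on made-up data. The column names follow the answer, but the values are assumptions, and the rows are deliberately out of date order. It sorts by group and date before taking the grouped running total, which is a small variation on the one-liner above that also keeps each group's rows together.

```r
library(dplyr)

# Hypothetical data: two groups, rows deliberately not in date order
df <- data.frame(
  dates = as.Date(c("2014-01-03", "2014-01-02", "2014-01-04",
                    "2014-01-02", "2014-01-04", "2014-01-03")),
  group = c("A", "A", "A", "B", "B", "B"),
  sales = c(2, 1, 3, 10, 30, 20)
)

pdf <- df %>%
  arrange(group, dates) %>%     # sort first so the running total follows the dates
  group_by(group) %>%
  mutate(cs = cumsum(sales)) %>% # cumsum restarts at the first row of each group
  ungroup()

pdf$cs
```

Without the arrange() step the running total would follow the row order of the input rather than the dates.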
How to do cumsum of 2 groups by dplyr?
Looks like you just need
dat %>%
group_by(item, choice) %>%
summarize(n=n()) %>%
mutate(cum = cumsum(n))
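The reason this works is that summarize() drops the last grouping variable, so the result is still grouped by item and the following cumsum() restarts for each item. A small sketch with invented data (the values in dat are assumptions, and `.groups = "drop_last"` only spells out the default behaviour, assuming dplyr 1.0.0 or later):

```r
library(dplyr)

# Hypothetical data: one row per observed (item, choice) pair
dat <- data.frame(
  item   = c("a", "a", "a", "a", "b", "b", "b"),
  choice = c("x", "x", "y", "z", "x", "y", "y")
)

res <- dat %>%
  group_by(item, choice) %>%
  summarize(n = n(), .groups = "drop_last") %>% # result stays grouped by item
  mutate(cum = cumsum(n)) %>%                   # running total within each item
  ungroup()

res$cum
```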
cumsum in grouped data with dplyr
When you group by Local.Authority and year, every group contains a single row, so cumsum just returns each row's own total (1, -1, 1). Group by Local.Authority only, so that cumsum runs over all the totals for that authority and gives 1, 0, 1:
df <- df %>%
group_by(Local.Authority) %>%
mutate(cum.to = cumsum(total))
> df
Source: local data frame [3 x 8]
Groups: Local.Authority [1]
Provider.ID Local.Authority month year entry exit total cum.to
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1-102642676 Kent 10 2010 1 0 1 1
2 1-102642676 Kent 9 2011 0 1 -1 0
3 1-102642676 Kent 10 2014 1 0 1 1
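The difference between the two groupings can be checked side by side. This sketch copies only the Local.Authority, year and total values from the three rows printed above; the object names by_both and by_authority are made up for the illustration.

```r
library(dplyr)

# Values taken from the three rows shown above
df <- data.frame(
  Local.Authority = "Kent",
  year  = c(2010, 2011, 2014),
  total = c(1, -1, 1)
)

# One row per (Local.Authority, year) group: cumsum has nothing to accumulate
by_both <- df %>%
  group_by(Local.Authority, year) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

# One group for the whole authority: cumsum runs across the years
by_authority <- df %>%
  group_by(Local.Authority) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

by_both$cum.to       # 1 -1  1
by_authority$cum.to  # 1  0  1
```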
Cumulative count by group over time in r
You can use count and cumsum:
library(dplyr)
df %>%
count(group, time, name = 'count') %>%
group_by(group) %>%
mutate(count = cumsum(count)) %>%
ungroup()
# group time count
# <chr> <dbl> <int>
#1 A 1 2
#2 A 2 3
#3 A 3 4
#4 B 1 1
#5 B 2 3
#6 C 1 1
#7 C 2 3
#8 C 3 5
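The input data is not shown with this answer, so here is a small self-contained sketch of the same two steps on invented rows (the values are assumptions): count() gives the number of rows per group and time, and the grouped cumsum() turns those into running counts.

```r
library(dplyr)

# Hypothetical events: one row per observation
df <- data.frame(
  group = c("A", "A", "A", "B", "B", "B"),
  time  = c(1, 1, 2, 1, 2, 2)
)

res <- df %>%
  count(group, time, name = "count") %>% # rows per (group, time)
  group_by(group) %>%
  mutate(count = cumsum(count)) %>%      # running count over time within group
  ungroup()

res$count
```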
Cumulative sum with `all` or `any` by group
You're not actually doing a cumsum; nothing needs to be summed. You are looking for the row number within the group.
Here are a couple of ways with dplyr:
df %>%
group_by(group) %>%
mutate(
result1 = row_number() * any(y %% 3 == 0),
result2 = case_when(
any(y %% 3 == 0) ~ row_number(),
TRUE ~ 0L
)
)
# # A tibble: 12 × 4
# # Groups: group [6]
# group y result1 result2
# <int> <int> <int> <int>
# 1 1 1 0 0
# 2 1 2 0 0
# 3 2 3 1 1
# 4 2 4 2 2
# 5 3 5 1 1
# 6 3 6 2 2
# 7 4 7 0 0
# 8 4 8 0 0
# 9 5 9 1 1
# 10 5 10 2 2
# 11 6 11 1 1
# 12 6 12 2 2
Calculate cumulative sum (cumsum) by group
df$csum <- ave(df$value, df$id, FUN=cumsum)
ave is the "go-to" function if you want a by-group vector of equal length to an existing vector and it can be computed from those sub-vectors alone. If you need by-group processing based on multiple "parallel" values, the base strategy is do.call(rbind, by(dfrm, grp, FUN)).
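Both base R strategies can be sketched on a toy data frame. The id and value columns follow the answer; the weight column, the wcsum result and the values themselves are hypothetical, added only to show a calculation that needs two columns at once.

```r
# Hypothetical data; weight is an invented second column
df <- data.frame(
  id     = c("a", "a", "b", "b", "b"),
  value  = c(1, 2, 10, 20, 30),
  weight = c(1, 2, 1, 1, 2)
)

# ave() returns a vector aligned with the original rows
df$csum <- ave(df$value, df$id, FUN = cumsum)

# by() passes each group's sub-data-frame to the function, so several
# columns are available together; do.call(rbind, ...) stacks the pieces
pieces <- by(df, df$id, function(d) {
  d$wcsum <- cumsum(d$value * d$weight) # weighted running total per group
  d
})
out <- do.call(rbind, pieces)

out
```

Note that the by() route returns the rows ordered by group, whereas ave() leaves the original row order untouched.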