rollmean with dplyr and magrittr
May be this helps:
library(dplyr)
library(zoo)
data %>%
group_by(o) %>%
mutate(rM=rollmean(u,3, na.pad=TRUE, align="right"))
If you want to do for both columns, u
and v
fun1 <- function(x) rollmean(x, 3, na.pad=TRUE, align="right")
data %>%
group_by(o) %>%
mutate_each(funs(fun1), u, v)
Conditionally Rollmean based on another column value
Include pitcher_hand
in group_by
-
library(dplyr)
FP %>%
group_by(player, pitcher_hand) %>%
mutate(FP_L3 = lag(rollmeanr(fantasy_points, 3, fill = NA))) %>%
ungroup
# player pitcher_hand fantasy_points FP_L3
# <chr> <chr> <dbl> <dbl>
# 1 A R 12.7 NA
# 2 A L 6.48 NA
# 3 A R 10.7 NA
# 4 A L 18.1 NA
# 5 A R 16.3 NA
# 6 A L 7.92 NA
# 7 A R 5.62 13.2
# 8 A L 22.5 10.8
# 9 A R 14.8 10.9
#10 A L 5.32 16.2
# … with 20 more rows
Rollmean over one month with different groups in R
Here are two equivalent alternatives.
In the first alternative below, the second argument to rollapplyr is a list such that the ith component is the vector of offsets to average over for the ith row of the group.
In the second alternative we can specify the width as a vector of widths, one per row, and then when taking the mean eliminate the last value.
Note that w is slightly different in the two alternatives.
Review ?rollapply
for details on the arguments and for further examples.
library(dplyr, exclude = c("filter", "lag"))
library(zoo)
example %>%
arrange(username, Date) %>%
group_by(username) %>%
mutate(w = seq_along(Date) - findInterval(Date - 30, Date) - 1,
avg30 = rollapplyr(ER, lapply(-w, seq, to = -1), mean, fill=NA)) %>%
ungroup
example %>%
arrange(username, Date) %>%
group_by(username) %>%
mutate(w = seq_along(Date) - findInterval(Date - 30, Date),
avg30 = rollapplyr(ER, w, function(x) mean(head(x, -1)), fill = NA)) %>%
ungroup
Differences between %.% (dplyr) and %% (magrittr)
dplyr now imports %>%
from magrittr and uses it by default. See this answer for details.
Differences include
you can use a
.
as placeholder for the left-hand side, e.g.iris %>% plot(Sepal.Length ~ Sepal.Width, data = .)
%>%
respects(rhs)
, e.g.1:10 %>% (call("sum"))
1:10 %>% (function(x) x^2 + 2*x)For a more useful
example of this, see
https://gist.github.com/anonymous/0c69b019d0b4f6ae5050For
single argument function calls, you can omit parens:"2014-05-18" %>% as.Date
Adjust function to work with dplyr/magrittr
You can pass an ...
parameter directly to the vars
helper of summarise_at
, e.g.
foo <- function(.tbl, ...){
summarise_at(.tbl,
vars(...),
funs(mean(unlist(.))))
}
It works for single variables, list column or not:
df %>% foo(b)
## # A tibble: 1 × 1
## b
## <dbl>
## 1 18.48936
or multiple:
df %>% foo(a, b)
## # A tibble: 1 × 2
## a b
## <dbl> <dbl>
## 1 1.5 18.48936
To go further with NSE, check out lazyeval, which is the package dplyr uses to implement its NSE.
Also note that the SE/NSE system of dplyr has just been rebuilt in the development version (not on CRAN yet, and not yet documented).
Bonus points: Do it all in base R!
foo <- function(.tbl, ...){
# collect dots as character vector
cols <- as.character(substitute(list(...))[-1])
cls <- class(.tbl)
# handle grouped tibbles properly
if('grouped_df' %in% cls){
cls <- cls[which(cls != 'grouped_df')] # drop grouping
res <- aggregate(.tbl[cols],
.tbl[attr(.tbl, 'vars')],
FUN = function(x){mean(unlist(x))})
} else {
res <- as.data.frame(lapply(.tbl[cols], function(x){mean(unlist(x))}))
}
class(res) <- cls # keep class (tibble, etc.)
res
}
which works with list columns, groups, and multiple columns or groups, keeping class but dropping grouping:
df %>% foo(a, b)
## # A tibble: 1 × 2
## a b
## <dbl> <dbl>
## 1 1.5 18.48936
df %>% group_by(a) %>% foo(b)
## # A tibble: 2 × 2
## a b
## <int> <dbl>
## 1 1 5.5
## 2 2 22.0
mtcars %>% foo(mpg, hp)
## mpg hp
## 1 20.09062 146.6875
mtcars %>% group_by(cyl, am) %>% foo(hp, mpg)
## # A tibble: 6 × 4
## cyl am hp mpg
## <dbl> <dbl> <dbl> <dbl>
## 1 4 0 84.66667 22.90000
## 2 6 0 115.25000 19.12500
## 3 8 0 194.16667 15.05000
## 4 4 1 81.87500 28.07500
## 5 6 1 131.66667 20.56667
## 6 8 1 299.50000 15.40000
how to make a rolling sum (or rolling average) with multiple variables
I did it using dplyr::lag()
library(dplyr)
library(tibble)
## Data
data <- tribble(
~Date, ~Prod, ~Amount,
"2010-01-28", "Corn", 1,
"2010-01-28", "Potato", 2,
"2010-02-28", "Corn", 3,
"2010-02-28", "Potato", 4,
"2010-03-28", "Corn", 5,
"2010-03-28", "Potato", 6,
"2010-04-28", "Corn", 7,
"2010-04-28", "Potato", 8
)
# Code
data %>%
group_by(Prod) %>%
mutate(cum_amount = Amount + lag(Amount, 1) + lag(Amount, 2)) %>%
filter(!is.na(cum_amount))
# A tibble: 4 x 4
# Groups: Prod [2]
Date Prod Amount cum_amount
<chr> <chr> <dbl> <dbl>
1 2010-03-28 Corn 5 9
2 2010-03-28 Potato 6 12
3 2010-04-28 Corn 7 15
4 2010-04-28 Potato 8 18
Update in order to your comment
data %>%
group_by(Prod) %>%
mutate(cum_amount = c(rep(NA, 2), zoo::rollsum(Amount, 3))) %>%
filter(!is.na(cum_amount))
PS: Remember to include the R tag in your questions
Dplyr or Magrittr - tolower?
Using magrittr
's "compound assignment pipe-operator" %<>%
might be, if I understand your question correctly, an even more succinct option.
library("magrittr")
names(iris) %<>% tolower
?`%<>%` # for more
offset rollapply in R
All you have to do is use the exact same call to zoo::rollapply
but with a window of size 4, lagged 4.
library(dplyr)
library(zoo)
dfx %>%
mutate(roll_1 = lag(rollapply(B, 3, mean, fill=NA, align="right"),1),
roll_2 = lag(rollapply(B, 4, mean, fill=NA, align="right"),4))
#> A B roll_1 roll_2
#> 1 1 0 NA NA
#> 2 2 0 NA NA
#> 3 3 2 NA NA
#> 4 4 2 0.6666667 NA
#> 5 5 0 1.3333333 NA
#> 6 6 0 1.3333333 NA
#> 7 7 1 0.6666667 NA
#> 8 8 0 0.3333333 1
#> 9 9 0 0.3333333 1
#> 10 10 0 0.3333333 1
How to use base::rowSums() with dplyr/magrittr pipe (%%)
iris[1:5, 1:4] %>% is_less_than(5) %>% rowSums
# 1 2 3 4 5
# 3 4 4 4 3
would be magrittr
equivalent to (no need in dplyr
here)
rowSums(iris[1:5, 1:4] < 5)
# 1 2 3 4 5
# 3 4 4 4 3
Related Topics
Why Is Stat = "Identity" Necessary in Geom_Bar in Ggplot
Texture in Barplot for 7 Bars in R
Problems Using Foreach Parallelization
How to Change the Na Color from Gray to White in a Ggplot Choropleth Map
Dynamic Height and Width for Knitr Plots
How to Get Parameters from Config File in R Script
How to Display Widgets Inline in Shiny
How to Convert Date and Time from Character to Datetime Type
Colorize Clusters in Dendogram with Ggplot2
Plotting Multiple Curves Same Graph and Same Scale
Calculating the Difference Between Consecutive Rows by Group Using Dplyr
How to Plot Logit and Probit in Ggplot2
Using Geo-Coordinates as Vertex Coordinates in the Igraph R-Package
How to Leave the R Browser() Mode in the Console Window
Forcing R Output to Be Scientific Notation with at Most Two Decimals
Show That Shiny Is Busy (Or Loading) When Changing Tab Panels