Rollmean with Dplyr and Magrittr

rollmean with dplyr and magrittr

May be this helps:

library(dplyr)
library(zoo)
data %>%
group_by(o) %>%
mutate(rM=rollmean(u,3, na.pad=TRUE, align="right"))

If you want to do for both columns, u and v

fun1 <- function(x) rollmean(x, 3, na.pad=TRUE, align="right")
data %>%
group_by(o) %>%
mutate_each(funs(fun1), u, v)

Conditionally Rollmean based on another column value

Include pitcher_hand in group_by -

library(dplyr)

FP %>%
group_by(player, pitcher_hand) %>%
mutate(FP_L3 = lag(rollmeanr(fantasy_points, 3, fill = NA))) %>%
ungroup

# player pitcher_hand fantasy_points FP_L3
# <chr> <chr> <dbl> <dbl>
# 1 A R 12.7 NA
# 2 A L 6.48 NA
# 3 A R 10.7 NA
# 4 A L 18.1 NA
# 5 A R 16.3 NA
# 6 A L 7.92 NA
# 7 A R 5.62 13.2
# 8 A L 22.5 10.8
# 9 A R 14.8 10.9
#10 A L 5.32 16.2
# … with 20 more rows

Rollmean over one month with different groups in R

Here are two equivalent alternatives.

In the first alternative below, the second argument to rollapplyr is a list such that the ith component is the vector of offsets to average over for the ith row of the group.

In the second alternative we can specify the width as a vector of widths, one per row, and then when taking the mean eliminate the last value.

Note that w is slightly different in the two alternatives.

Review ?rollapply for details on the arguments and for further examples.

library(dplyr, exclude = c("filter", "lag"))
library(zoo)

example %>%
arrange(username, Date) %>%
group_by(username) %>%
mutate(w = seq_along(Date) - findInterval(Date - 30, Date) - 1,
avg30 = rollapplyr(ER, lapply(-w, seq, to = -1), mean, fill=NA)) %>%
ungroup

example %>%
arrange(username, Date) %>%
group_by(username) %>%
mutate(w = seq_along(Date) - findInterval(Date - 30, Date),
avg30 = rollapplyr(ER, w, function(x) mean(head(x, -1)), fill = NA)) %>%
ungroup

Differences between %.% (dplyr) and %% (magrittr)

dplyr now imports %>% from magrittr and uses it by default. See this answer for details.


Differences include

  • you can use a . as placeholder for the left-hand side, e.g.

     iris %>% plot(Sepal.Length ~ Sepal.Width, data = .)
  • %>% respects (rhs), e.g.

     1:10 %>% (call("sum"))
    1:10 %>% (function(x) x^2 + 2*x)

    For a more useful
    example of this, see

    https://gist.github.com/anonymous/0c69b019d0b4f6ae5050

  • For
    single argument function calls, you can omit parens:

     "2014-05-18" %>% as.Date

Adjust function to work with dplyr/magrittr

You can pass an ... parameter directly to the vars helper of summarise_at, e.g.

foo <- function(.tbl, ...){
summarise_at(.tbl,
vars(...),
funs(mean(unlist(.))))
}

It works for single variables, list column or not:

df %>% foo(b)
## # A tibble: 1 × 1
## b
## <dbl>
## 1 18.48936

or multiple:

df %>% foo(a, b)
## # A tibble: 1 × 2
## a b
## <dbl> <dbl>
## 1 1.5 18.48936

To go further with NSE, check out lazyeval, which is the package dplyr uses to implement its NSE.

Also note that the SE/NSE system of dplyr has just been rebuilt in the development version (not on CRAN yet, and not yet documented).


Bonus points: Do it all in base R!

foo <- function(.tbl, ...){
# collect dots as character vector
cols <- as.character(substitute(list(...))[-1])
cls <- class(.tbl)

# handle grouped tibbles properly
if('grouped_df' %in% cls){
cls <- cls[which(cls != 'grouped_df')] # drop grouping
res <- aggregate(.tbl[cols],
.tbl[attr(.tbl, 'vars')],
FUN = function(x){mean(unlist(x))})
} else {
res <- as.data.frame(lapply(.tbl[cols], function(x){mean(unlist(x))}))
}

class(res) <- cls # keep class (tibble, etc.)
res
}

which works with list columns, groups, and multiple columns or groups, keeping class but dropping grouping:

df %>% foo(a, b)
## # A tibble: 1 × 2
## a b
## <dbl> <dbl>
## 1 1.5 18.48936

df %>% group_by(a) %>% foo(b)
## # A tibble: 2 × 2
## a b
## <int> <dbl>
## 1 1 5.5
## 2 2 22.0

mtcars %>% foo(mpg, hp)
## mpg hp
## 1 20.09062 146.6875

mtcars %>% group_by(cyl, am) %>% foo(hp, mpg)
## # A tibble: 6 × 4
## cyl am hp mpg
## <dbl> <dbl> <dbl> <dbl>
## 1 4 0 84.66667 22.90000
## 2 6 0 115.25000 19.12500
## 3 8 0 194.16667 15.05000
## 4 4 1 81.87500 28.07500
## 5 6 1 131.66667 20.56667
## 6 8 1 299.50000 15.40000

how to make a rolling sum (or rolling average) with multiple variables

I did it using dplyr::lag()

library(dplyr)
library(tibble)

## Data
data <- tribble(
~Date, ~Prod, ~Amount,
"2010-01-28", "Corn", 1,
"2010-01-28", "Potato", 2,
"2010-02-28", "Corn", 3,
"2010-02-28", "Potato", 4,
"2010-03-28", "Corn", 5,
"2010-03-28", "Potato", 6,
"2010-04-28", "Corn", 7,
"2010-04-28", "Potato", 8
)

# Code

data %>%
group_by(Prod) %>%
mutate(cum_amount = Amount + lag(Amount, 1) + lag(Amount, 2)) %>%
filter(!is.na(cum_amount))

# A tibble: 4 x 4
# Groups: Prod [2]
Date Prod Amount cum_amount
<chr> <chr> <dbl> <dbl>
1 2010-03-28 Corn 5 9
2 2010-03-28 Potato 6 12
3 2010-04-28 Corn 7 15
4 2010-04-28 Potato 8 18

Update in order to your comment

data %>% 
group_by(Prod) %>%
mutate(cum_amount = c(rep(NA, 2), zoo::rollsum(Amount, 3))) %>%
filter(!is.na(cum_amount))

PS: Remember to include the R tag in your questions

Dplyr or Magrittr - tolower?

Using magrittr's "compound assignment pipe-operator" %<>% might be, if I understand your question correctly, an even more succinct option.

library("magrittr")
names(iris) %<>% tolower

?`%<>%` # for more

offset rollapply in R

All you have to do is use the exact same call to zoo::rollapply but with a window of size 4, lagged 4.

library(dplyr)
library(zoo)

dfx %>%
mutate(roll_1 = lag(rollapply(B, 3, mean, fill=NA, align="right"),1),
roll_2 = lag(rollapply(B, 4, mean, fill=NA, align="right"),4))
#> A B roll_1 roll_2
#> 1 1 0 NA NA
#> 2 2 0 NA NA
#> 3 3 2 NA NA
#> 4 4 2 0.6666667 NA
#> 5 5 0 1.3333333 NA
#> 6 6 0 1.3333333 NA
#> 7 7 1 0.6666667 NA
#> 8 8 0 0.3333333 1
#> 9 9 0 0.3333333 1
#> 10 10 0 0.3333333 1

How to use base::rowSums() with dplyr/magrittr pipe (%%)

iris[1:5, 1:4] %>% is_less_than(5) %>% rowSums
# 1 2 3 4 5
# 3 4 4 4 3

would be magrittr equivalent to (no need in dplyr here)

rowSums(iris[1:5, 1:4] < 5)  
# 1 2 3 4 5
# 3 4 4 4 3


Related Topics



Leave a reply



Submit