Rolling by Group in Data.Table R

Rolling by group in data.table R

When you use dt$return the whole data.table is picked internally within the groups. Just use the column you need in the function definition and it will work fine:

#use the column instead of the data.table
zoo_fun <- function(column, N) {
  (rollapply(column + 1, N, FUN=prod, fill=NA, align='right') - 1)
}

#now it works fine
test[, momentum := zoo_fun(return, 3), by = sec]

As a separate note, you should probably not use return as a column or variable name.

Out:

> test
    return sec momentum
 1:    0.1   A       NA
 2:    0.1   A       NA
 3:    0.1   A    0.331
 4:    0.1   A    0.331
 5:    0.1   A    0.331
 6:    0.2   B       NA
 7:    0.2   B       NA
 8:    0.2   B    0.728
 9:    0.2   B    0.728
10:    0.2   B    0.728

Rolling Mean By Group Dplyr/data.table

you could use map from the purrr package and apply it on 1:n():

df = df %>% 
  na.omit() %>% 
  group_by(ticker) %>% 
  mutate(avg10 = map_dbl(1:n(), ~mean(lag_close[(max(.x-9, 1)):.x], na.rm =T))

Of course you have to decide what should happen with the first 9 rows where there are fewer than 10 observations. In my solution the rows 1 to 9 contain the mean of the last 1 to 9 observations.

Apply a rolling function by group in r (zoo, data.table)

Add fill=NA to rollapply. This will ensure that a vector of length 50 (rather than 49) is returned, with NA as the first value (since align="right"), avoiding recycling.

A[,stdev := rollapply(Petal.Width, width=2, sd, align='right', partial=F, fill=NA), by=Species]

    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1            5.1         3.5          1.4         0.2     setosa         NA
2            4.9         3.0          1.4         0.2     setosa 0.00000000
3            4.7         3.2          1.3         0.2     setosa 0.00000000
...
51           7.0         3.2          4.7         1.4 versicolor         NA
52           6.4         3.2          4.5         1.5 versicolor 0.07071068
53           6.9         3.1          4.9         1.5 versicolor 0.00000000
...
101          6.3         3.3          6.0         2.5  virginica         NA
102          5.8         2.7          5.1         1.9  virginica 0.42426407
103          7.1         3.0          5.9         2.1  virginica 0.14142136

Rolling join with group ID in both tables

We can use a rolling join with data.table

library(data.table)
table_to_join[, date_ad := date_sale][original_table, 
       on = .(article, date_sale = date_ad), roll = -Inf]
#   article  date_sale    date_ad
#1:       A 2010-04-09 2010-12-15
#2:       A 2011-07-12 2012-08-20
#3:       B 2015-04-13 2016-01-05
#4:       B 2016-08-12 2017-01-20

I think OP was confused with the naming of the output columns. Using an update by reference should be clearer:

table_to_join[, date_ad := date_sale]
original_table[, date_sale :=
    table_to_join[.SD, on=.(article, date_ad), roll=-Inf, x.date_sale]
]

output:

   article  date_sale    date_ad
1:       A 2010-04-09 2010-12-15
2:       A 2011-07-12 2012-08-20
3:       A 2012-05-22 2012-08-20
4:       B 2011-07-12 2013-12-01
5:       B 2014-02-02 2016-01-05
6:       B 2015-04-13 2016-01-05
7:       B 2016-08-12 2017-01-20

Rolling operation over rows by group in data.table

This is farily simple using

dt[, result := values[year == 2014] - values[year == 2017], by = id]
#   id values year result
#1:  a     11 2014     10
#2:  b     22 2014     20
#3:  c     33 2014     30
#4:  d     44 2014     40
#5:  a      1 2017     10
#6:  b      2 2017     20
#7:  c      3 2017     30
#8:  d      4 2017     40

Another option (less explicit) is with diff:

dt[order(-year), result := diff(values), by = id]

Apply a rolling function to a data.table in R with a custom window size

dt[, z := shift(frollapply(x, n = 5, FUN = min, fill = 0), n = 5, fill = 0)]
dt[, all.equal(y, z)] # TRUE

Rolling by Group in Data.Table R