Rolling by Group in Data.Table R

Rolling by group in data.table R

When you use dt$return the whole data.table is picked internally within the groups. Just use the column you need in the function definition and it will work fine:

#use the column instead of the data.table
zoo_fun <- function(column, N) {
(rollapply(column + 1, N, FUN=prod, fill=NA, align='right') - 1)
}

#now it works fine
test[, momentum := zoo_fun(return, 3), by = sec]

As a separate note, you should probably not use return as a column or variable name.

Out:

> test
return sec momentum
1: 0.1 A NA
2: 0.1 A NA
3: 0.1 A 0.331
4: 0.1 A 0.331
5: 0.1 A 0.331
6: 0.2 B NA
7: 0.2 B NA
8: 0.2 B 0.728
9: 0.2 B 0.728
10: 0.2 B 0.728

Rolling Mean By Group Dplyr/data.table

you could use map from the purrr package and apply it on 1:n():

df = df %>% 
na.omit() %>%
group_by(ticker) %>%
mutate(avg10 = map_dbl(1:n(), ~mean(lag_close[(max(.x-9, 1)):.x], na.rm =T))

Of course you have to decide what should happen with the first 9 rows where there are fewer than 10 observations. In my solution the rows 1 to 9 contain the mean of the last 1 to 9 observations.

Apply a rolling function by group in r (zoo, data.table)

Add fill=NA to rollapply. This will ensure that a vector of length 50 (rather than 49) is returned, with NA as the first value (since align="right"), avoiding recycling.

A[,stdev := rollapply(Petal.Width, width=2, sd, align='right', partial=F, fill=NA), by=Species]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1 5.1 3.5 1.4 0.2 setosa NA
2 4.9 3.0 1.4 0.2 setosa 0.00000000
3 4.7 3.2 1.3 0.2 setosa 0.00000000
...
51 7.0 3.2 4.7 1.4 versicolor NA
52 6.4 3.2 4.5 1.5 versicolor 0.07071068
53 6.9 3.1 4.9 1.5 versicolor 0.00000000
...
101 6.3 3.3 6.0 2.5 virginica NA
102 5.8 2.7 5.1 1.9 virginica 0.42426407
103 7.1 3.0 5.9 2.1 virginica 0.14142136

Rolling join with group ID in both tables

We can use a rolling join with data.table

library(data.table)
table_to_join[, date_ad := date_sale][original_table,
on = .(article, date_sale = date_ad), roll = -Inf]
# article date_sale date_ad
#1: A 2010-04-09 2010-12-15
#2: A 2011-07-12 2012-08-20
#3: B 2015-04-13 2016-01-05
#4: B 2016-08-12 2017-01-20

I think OP was confused with the naming of the output columns. Using an update by reference should be clearer:

table_to_join[, date_ad := date_sale]
original_table[, date_sale :=
table_to_join[.SD, on=.(article, date_ad), roll=-Inf, x.date_sale]
]

output:

   article  date_sale    date_ad
1: A 2010-04-09 2010-12-15
2: A 2011-07-12 2012-08-20
3: A 2012-05-22 2012-08-20
4: B 2011-07-12 2013-12-01
5: B 2014-02-02 2016-01-05
6: B 2015-04-13 2016-01-05
7: B 2016-08-12 2017-01-20

Rolling operation over rows by group in data.table

This is farily simple using

dt[, result := values[year == 2014] - values[year == 2017], by = id]
# id values year result
#1: a 11 2014 10
#2: b 22 2014 20
#3: c 33 2014 30
#4: d 44 2014 40
#5: a 1 2017 10
#6: b 2 2017 20
#7: c 3 2017 30
#8: d 4 2017 40

Another option (less explicit) is with diff:

dt[order(-year), result := diff(values), by = id]

Apply a rolling function to a data.table in R with a custom window size

dt[, z := shift(frollapply(x, n = 5, FUN = min, fill = 0), n = 5, fill = 0)]
dt[, all.equal(y, z)] # TRUE


Related Topics



Leave a reply



Submit