Rolling by group in data.table R
When you use dt$return inside the function, the whole column of the data.table is picked up, not just the rows of the current group. Just use the column you need in the function definition and it will work fine:
library(data.table)
library(zoo)
#use the column instead of the data.table
zoo_fun <- function(column, N) {
  rollapply(column + 1, N, FUN=prod, fill=NA, align='right') - 1
}
#now it works fine
test[, momentum := zoo_fun(return, 3), by = sec]
As a separate note, you should probably not use return as a column or variable name, since return is also the name of a base R function.
Out:
> test
return sec momentum
1: 0.1 A NA
2: 0.1 A NA
3: 0.1 A 0.331
4: 0.1 A 0.331
5: 0.1 A 0.331
6: 0.2 B NA
7: 0.2 B NA
8: 0.2 B 0.728
9: 0.2 B 0.728
10: 0.2 B 0.728
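To make the difference concrete, here is a minimal, self-contained sketch. It rebuilds the test table from the printed output above and compares what the function sees per group with what test$ret would hand it. The column name ret is my own substitution for return (an assumption for illustration, not part of the original answer).

```r
library(data.table)
library(zoo)

# Data reconstructed from the printed output above; 'ret' is a renamed
# stand-in for the original 'return' column (assumption for illustration).
test <- data.table(ret = rep(c(0.1, 0.2), each = 5),
                   sec = rep(c("A", "B"), each = 5))

# Compound return over the trailing N observations of one column
zoo_fun <- function(column, N) {
  rollapply(column + 1, N, FUN = prod, fill = NA, align = "right") - 1
}

# Inside j with 'by', 'ret' is only the current group's slice (5 values)
test[, momentum := zoo_fun(ret, 3), by = sec]

# By contrast, test$ret always refers to all 10 rows, whatever the group
sizes <- test[, .(n_in_group = length(ret), n_whole = length(test$ret)), by = sec]
print(sizes)
```

The sizes table shows why passing the whole table goes wrong: each group holds 5 values, while test$ret supplies 10 regardless of the group being processed.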
Rolling Mean By Group Dplyr/data.table
You could use map_dbl from the purrr package and apply it on 1:n():
library(dplyr)
library(purrr)
df = df %>%
  na.omit() %>%
  group_by(ticker) %>%
  mutate(avg10 = map_dbl(1:n(), ~mean(lag_close[(max(.x-9, 1)):.x], na.rm = TRUE)))
Of course you have to decide what should happen with the first 9 rows where there are fewer than 10 observations. In my solution the rows 1 to 9 contain the mean of the last 1 to 9 observations.
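The index arithmetic behind that choice can be checked on a small vector. The sketch below uses a made-up vector and a window of 3 instead of 10 (both are assumptions for illustration) and compares the manual indexing with zoo's rollapplyr using partial = TRUE, which appears to apply the same shrinking-window rule at the start of the series.

```r
library(zoo)

x <- c(2, 4, 6, 8, 10)  # made-up values
w <- 3                  # window width; the answer above uses 10

# Manual version: window runs from max(i - w + 1, 1) to i
manual <- sapply(seq_along(x), function(i) mean(x[max(i - w + 1, 1):i]))

# zoo version: right-aligned window, shorter windows allowed at the start
partial <- rollapplyr(x, width = w, FUN = mean, partial = TRUE)

manual   # 2 3 4 6 8
partial  # 2 3 4 6 8
```

If the first rows should be NA instead, fill = NA without partial is the usual alternative.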
Apply a rolling function by group in r (zoo, data.table)
Add fill=NA to rollapply. This will ensure that a vector of length 50 (rather than 49) is returned, with NA as the first value (since align="right"), avoiding recycling.
A[,stdev := rollapply(Petal.Width, width=2, sd, align='right', partial=F, fill=NA), by=Species]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species stdev
1 5.1 3.5 1.4 0.2 setosa NA
2 4.9 3.0 1.4 0.2 setosa 0.00000000
3 4.7 3.2 1.3 0.2 setosa 0.00000000
...
51 7.0 3.2 4.7 1.4 versicolor NA
52 6.4 3.2 4.5 1.5 versicolor 0.07071068
53 6.9 3.1 4.9 1.5 versicolor 0.00000000
...
101 6.3 3.3 6.0 2.5 virginica NA
102 5.8 2.7 5.1 1.9 virginica 0.42426407
103 7.1 3.0 5.9 2.1 virginica 0.14142136
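The length difference described above can be seen directly on one species. This is a small sketch on the built-in iris data; the variable names are hypothetical.

```r
library(zoo)

# Petal widths for a single species: 50 values
x <- iris$Petal.Width[iris$Species == "setosa"]

# Without fill, only complete windows are returned
no_fill <- rollapply(x, width = 2, FUN = sd, align = "right")

# With fill = NA, the result is padded back to the input length
with_fill <- rollapply(x, width = 2, FUN = sd, align = "right", fill = NA)

length(no_fill)    # 49
length(with_fill)  # 50
```

Assigning the 49-element result to a 50-row group is what forces data.table to recycle; the padded version lines up row for row.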
Rolling join with group ID in both tables
We can use a rolling join with data.table:
library(data.table)
table_to_join[, date_ad := date_sale][original_table,
on = .(article, date_sale = date_ad), roll = -Inf]
# article date_sale date_ad
#1: A 2010-04-09 2010-12-15
#2: A 2011-07-12 2012-08-20
#3: B 2015-04-13 2016-01-05
#4: B 2016-08-12 2017-01-20
I think OP was confused with the naming of the output columns. Using an update by reference should be clearer:
table_to_join[, date_ad := date_sale]
original_table[, date_sale :=
table_to_join[.SD, on=.(article, date_ad), roll=-Inf, x.date_sale]
]
output:
article date_sale date_ad
1: A 2010-04-09 2010-12-15
2: A 2011-07-12 2012-08-20
3: A 2012-05-22 2012-08-20
4: B 2011-07-12 2013-12-01
5: B 2014-02-02 2016-01-05
6: B 2015-04-13 2016-01-05
7: B 2016-08-12 2017-01-20
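Since the original input tables are not shown here, the following is a sketch on made-up data (the tables ads and sales, the helper column join_date, and all dates are assumptions for illustration). It uses the same update-by-reference pattern with roll = -Inf, which rolls the next observation backward, so each ad date picks up the first sale on or after it within the same article.

```r
library(data.table)

# Hypothetical inputs: ad dates and sale dates per article
ads <- data.table(
  article = c("A", "A", "B"),
  date_ad = as.IDate(c("2020-01-10", "2020-03-01", "2020-02-01"))
)
sales <- data.table(
  article   = c("A", "A", "B"),
  date_sale = as.IDate(c("2020-02-01", "2020-04-15", "2020-01-15"))
)

# Copy the sale date into a separate join column so the original
# date_sale survives the join under its own name
sales[, join_date := date_sale]

# For each ad, look up the next sale at or after the ad date
ads[, next_sale := sales[.SD, on = .(article, join_date = date_ad),
                         roll = -Inf, x.date_sale]]
print(ads)
# Article A gets 2020-02-01 and 2020-04-15; article B has no later sale, so NA
```

Keeping a separate join column and asking for x.date_sale explicitly avoids the column-naming confusion mentioned above, because the result column is named by the assignment rather than by the join.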
Rolling operation over rows by group in data.table
This is fairly simple using a grouped update:
dt[, result := values[year == 2014] - values[year == 2017], by = id]
# id values year result
#1: a 11 2014 10
#2: b 22 2014 20
#3: c 33 2014 30
#4: d 44 2014 40
#5: a 1 2017 10
#6: b 2 2017 20
#7: c 3 2017 30
#8: d 4 2017 40
Another option (less explicit) is with diff:
dt[order(-year), result := diff(values), by = id]
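Both variants can be run side by side. The sketch below rebuilds dt from the printed output above and stores the diff result in a second column, result_diff (a name chosen here for illustration), so the two can be compared.

```r
library(data.table)

# Data reconstructed from the printed output above
dt <- data.table(
  id     = rep(c("a", "b", "c", "d"), times = 2),
  values = c(11, 22, 33, 44, 1, 2, 3, 4),
  year   = rep(c(2014, 2017), each = 4)
)

# Explicit version: 2014 value minus 2017 value within each id
dt[, result := values[year == 2014] - values[year == 2017], by = id]

# diff version: with years in descending order, each group is
# (2017 value, 2014 value), so diff() gives 2014 minus 2017
dt[order(-year), result_diff := diff(values), by = id]

dt[, all.equal(result, result_diff)]  # TRUE
```

The diff version relies on every id having exactly two rows; with more years per id, diff would return more than one value per group.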
Apply a rolling function to a data.table in R with a custom window size
dt[, z := shift(frollapply(x, n = 5, FUN = min, fill = 0), n = 5, fill = 0)]
dt[, all.equal(y, z)] # TRUE
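The snippet above comes without its input data, so here is a sketch of what the expression appears to compute, on a made-up vector. frollapply with n = 5 takes the minimum over the current row and the four before it; shifting that result by 5 moves the window back, so row i ends up with the minimum of rows i-9 to i-5, and 0 where that window is incomplete. The manual loop is my own cross-check, not part of the original answer.

```r
library(data.table)

# Made-up input; the original x and y are not shown
dt <- data.table(x = c(5, 3, 8, 1, 9, 2, 7, 4, 6, 10, 0, 11))

# Rolling minimum over 5 rows, then lagged by 5 rows
dt[, z := shift(frollapply(x, n = 5, FUN = min, fill = 0), n = 5, fill = 0)]

# Cross-check: minimum of rows i-9 .. i-5, or 0 when the window is incomplete
manual <- sapply(seq_len(nrow(dt)), function(i) {
  if (i >= 10) min(dt$x[(i - 9):(i - 5)]) else 0
})

check <- all.equal(dt$z, manual)
print(check)
```

The argument names follow the snippet above; other data.table versions may spell them differently, so treat the call as version dependent.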