Rolling Window Over Irregular Time Series

Rolling window function for irregular time series that can handle duplicates

This can be solved by grouping in a non-equi join to aggregate over a rolling window of length k, filtering for k consecutive years, and an update join:

library(data.table)
k <- 3L
# group by join parameters of a non-equi join
mDT <- setDT(DT)[.(grp = grp, upper = yr, lower = yr - k),
on = .(grp, yr <= upper, yr > lower),
.(uniqueN(x.yr), mean(nr)), by = .EACHI]
# update join with filtered intermediate result
DT[mDT[V1 == k], on = .(grp, yr), paste0("nr_roll_period_", k) := V2]
DT

which returns OP's expected result:

    grp  nr   yr nr_roll_period
1: A 1.0 2009 NA
2: A 2.0 2009 NA
3: A 1.5 2009 NA
4: A 1.0 2010 NA
5: B 3.0 2009 NA
6: B 2.0 2010 NA
7: B NA 2011 NA
8: C 3.0 2014 NA
9: C 3.0 2019 NA
10: C 3.0 2020 NA
11: C 4.0 2021 3.333333

The intermediate result mDT contains the rolling mean V2 over k periods and the count of unique/distinct years V1 within each period. It is created by a non-equi join of DT with a data.table containing the upper and lower bounds which is created on-the-fly by .(grp = grp, upper = yr, lower = yr - k).

mDT
    grp   yr   yr V1       V2
1: A 2009 2006 1 1.500000
2: A 2009 2006 1 1.500000
3: A 2009 2006 1 1.500000
4: A 2010 2007 2 1.375000
5: B 2009 2006 1 3.000000
6: B 2010 2007 2 2.500000
7: B 2011 2008 3 NA
8: C 2014 2011 1 3.000000
9: C 2019 2016 1 3.000000
10: C 2020 2017 2 3.000000
11: C 2021 2018 3 3.333333

This is filtered for rows which contain exactly k distinct years:

mDT[V1 == k]
   grp   yr   yr V1       V2
1: B 2011 2008 3 NA
2: C 2021 2018 3 3.333333

Finally, this is joined with DT to append the new column to DT.

Note, that mean() returns NA by default if there is an NA in the input data.

Data

library(data.table)
DT <- fread(text = "rn grp nr yr
1: A 1.0 2009
2: A 2.0 2009
3: A 1.5 2009
4: A 1.0 2010
5: B 3.0 2009
6: B 2.0 2010
7: B NA 2011
8: C 3.0 2014
9: C 3.0 2019
10: C 3.0 2020
11: C 4.0 2021", drop = 1L)

R Rolling average from irregular time series

Calcuations within a sliding or rolling window of an irregular time series can be solved by data.table's ability to aggregate in a non-equi join.

There are many similar questions, e.g., r calculating rolling average with window based on value (not number of rows or date/time variable) or Rolling regression on irregular time series.

However, this question is different and thus deserves an answer on its own. From OP's own answer it can be concluded that the OP is looking for a centred rolling window. In addition, the rolling mean is to be computed for several columns.

library(data.table)
cols <- c("value2", "value3")
setDT(df)[SJ(year = (min(year) + 2):(max(year) - 2))[, c("start", "end") := .(year - 2, year + 2)],
on = .(year >= start, year < end),
c(.(year = i.year), lapply(.SD, mean)), .SDcols = cols, by = .EACHI][, -(1:2)]
   year      value2      value3
1: 2002 0.57494219 -0.53001134
2: 2003 0.33925292 0.75541896
3: 2004 -0.05834453 0.23987209
4: 2005 0.17031099 0.13074666
5: 2006 0.05272739 0.09297215
6: 2007 -0.12935805 -0.38780964
7: 2008 0.19716437 -0.11587017

The result is identical to OP's own result rmeans.

Data

set.seed(123)   # ensure reproducible sample data
df <- data.frame(
year = rep(2000:2010, c(3, 1, 0, 0, 4, 3, 3, 1, 2, 6, 8)),
value1 = rnorm(31), value2 = rnorm(31), value3 = rnorm(31))

Rollapply with different rolling window on each vector of time series

1) Convert to a wide form zoo object z and then to a list of zoo objects L, one per column of z, apply rollfun to each component of L creating a list of zoo objects and then merge back into a wide form zoo object zroll and either use that or optionally convert to long form data frame droll.

library(zoo)

z <- read.zoo(dat, split = "fact")
L <- as.list(z)

rollfun <- function(x) rollapplyr(x, length(na.omit(x)) - 252, mean)
zroll <- do.call("merge", Map(rollfun, L))
droll <- fortify.zoo(zroll, melt = TRUE)

2) This could also be expressed as a pipeline where rollfun is from above.

droll2 <- dat |>
read.zoo(split = "fact") |>
as.list() |>
Map(f = rollfun) |>
do.call(what = "merge") |>
fortify.zoo(melt = TRUE)

3) With dplyr

library(dplyr, exclude = c("lag", "filter"))
library(zoo)

dat %>%
group_by(fact) %>%
mutate(roll = rollapplyr(value, n() - 252, mean, fill = NA)) %>%
ungroup



Related Topics



Leave a reply



Submit