Is There a _Fast_ Way to Run a Rolling Regression Inside Data.Table

Is there a _fast_ way to run a rolling regression inside data.table?

Not as far as I know; data.table doesn't have any special features for rolling windows. Other packages already implement rolling functions on vectors, and those can be called in the j of data.table. If they are not efficient enough and no package offers a faster version, then it's a case of writing one yourself and (of course) contributing it, either to an existing package or as a new package of your own.
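As a rough illustration of what "writing a faster version yourself" could look like for a simple regression of y on x, the sketch below gets the rolling slope and intercept from cumulative sums instead of refitting lm() on every window. The helper names roll_sum() and roll_ols() are made up for this example and do not come from any package. It uses base R only, and the result is cross-checked against lm() on one window.

```r
# Hypothetical helper: right-aligned rolling sum via cumulative sums.
roll_sum <- function(v, width) {
  cs  <- cumsum(v)
  out <- rep(NA_real_, length(v))
  if (length(v) >= width) {
    idx <- width:length(v)
    # sum over (i - width + 1):i equals cs[i] - cs[i - width], with cs[0] = 0
    out[idx] <- cs[idx] - c(0, cs)[idx - width + 1]
  }
  out
}

# Hypothetical helper: rolling OLS of y ~ x (one regressor plus intercept).
roll_ols <- function(y, x, width) {
  sx  <- roll_sum(x, width)
  sy  <- roll_sum(y, width)
  sxx <- roll_sum(x * x, width)
  sxy <- roll_sum(x * y, width)
  slope     <- (width * sxy - sx * sy) / (width * sxx - sx^2)
  intercept <- (sy - slope * sx) / width
  data.frame(intercept = intercept, slope = slope)
}

set.seed(1)
x   <- runif(30)
y   <- 2 * x + rnorm(30, sd = 0.1)
fit <- roll_ols(y, x, width = 12)

# Cross-check the last window against lm()
ref <- coef(lm(y[19:30] ~ x[19:30]))
all.equal(unname(ref), c(fit$intercept[30], fit$slope[30]))
```

Because these helpers work on plain vectors, they can be called in j with by= like any other function. On very long series, cumulative sums can lose precision, so treat this as a starting point rather than a finished implementation.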

Related questions (follow the links inside each one as well):

Using data.table to speed up rollapply

R data.table sliding window

Rolling regression over multiple columns in R

How do I speed up rolling regressions?

Here's how to do that with data.table, which should be the fastest way to do what you want. You first need to build a sigma function (the standard deviation of the regression residuals) and then apply it with rollapplyr from the zoo package over .SD.

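set.seed(1)
library(data.table)
library(zoo)  # provides rollapplyr()

# Fixed start date so that the printed output below is reproducible
dt <- data.table(PERMNO=rep(LETTERS[1:3],each=13),
                 YearMonth=seq.Date(from=as.Date("2017-11-19"),by="month",length.out =13),
                 Return=runif(39),VWReturn=runif(39))

#create sigma function: sd of the residuals from regressing column 1 on column 2
stdev <- function(x) sd(lm(x[, 1]~ x[, 2])$residuals)

#create new column with rollapplyr (right-aligned windows of 12 rows per PERMNO)
dt[,roll_sd:=rollapplyr(.SD, 12, stdev, by.column = FALSE, fill = NA),
    by=.(PERMNO),.SDcols = c("Return", "VWReturn")]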

    PERMNO  YearMonth     Return   VWReturn   roll_sd
 1:      A 2017-11-19 0.26550866 0.41127443        NA
 2:      A 2017-12-19 0.37212390 0.82094629        NA
 3:      A 2018-01-19 0.57285336 0.64706019        NA
 4:      A 2018-02-19 0.90820779 0.78293276        NA
 5:      A 2018-03-19 0.20168193 0.55303631        NA
 6:      A 2018-04-19 0.89838968 0.52971958        NA
 7:      A 2018-05-19 0.94467527 0.78935623        NA
 8:      A 2018-06-19 0.66079779 0.02333120        NA
 9:      A 2018-07-19 0.62911404 0.47723007        NA
10:      A 2018-08-19 0.06178627 0.73231374        NA
11:      A 2018-09-19 0.20597457 0.69273156        NA
12:      A 2018-10-19 0.17655675 0.47761962 0.3181427
13:      A 2018-11-19 0.68702285 0.86120948 0.3141638
14:      B 2017-11-19 0.38410372 0.43809711        NA
....
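If refitting lm() on every window turns out to be the bottleneck, the same residual standard deviation can be obtained in closed form from rolling means. The sketch below assumes data.table 1.12.0 or later, which provides frollmean(); roll_resid_sd() and the roll_sd_fast column are hypothetical names introduced for this example. It uses the same simulated Return and VWReturn values as above and checks one window against the lm()-based definition.

```r
library(data.table)

set.seed(1)
dt <- data.table(PERMNO   = rep(LETTERS[1:3], each = 13),
                 Return   = runif(39),
                 VWReturn = runif(39))

# Hypothetical helper: sd of the residuals of y ~ x over right-aligned
# windows of n rows, from rolling means rather than one lm() fit per window.
roll_resid_sd <- function(y, x, n) {
  mx  <- frollmean(x, n)
  my  <- frollmean(y, n)
  vx  <- frollmean(x * x, n) - mx^2      # population variance of x
  vy  <- frollmean(y * y, n) - my^2      # population variance of y
  cxy <- frollmean(x * y, n) - mx * my   # population covariance
  rss_over_n <- pmax(vy - cxy^2 / vx, 0) # guard against tiny negative values
  sqrt(rss_over_n * n / (n - 1))         # sd() divides by n - 1
}

dt[, roll_sd_fast := roll_resid_sd(Return, VWReturn, 12), by = PERMNO]

# Cross-check the last window of group A against the lm()-based definition
chk <- dt[PERMNO == "A"][2:13]
all.equal(sd(residuals(lm(Return ~ VWReturn, data = chk))),
          dt[PERMNO == "A"][13, roll_sd_fast])
```

This only covers a single regressor with an intercept. For several regressors, a package such as roll is likely the more practical route.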

Rolling Regression Data Frame

Consider using the roll package.

# Both packages are called with `::` below, so nothing needs to be attached.
requireNamespace("readr"); requireNamespace("roll")
ds <- readr::read_csv(
  "     Date, open.x, high.x, low.x, x_Close, volume.x, open.y, high.y, low.y, y_Close, volume.y
  2010-01-04,  57.32,  58.13, 57.32,   57.85,   442900,   6.61, 6.8400,  6.61,    6.83,   833100
  2010-01-05,  57.90,  58.33, 57.54,   58.20,   436900,   6.82, 7.1200,  6.80,    7.12,   904500
  2010-01-06,  58.20,  58.56, 58.01,   58.42,   850600,   7.05, 7.3800,  7.05,    7.27,   759800
  2010-01-07,  58.31,  58.41, 57.14,   57.90,   463600,   7.24, 7.3000,  7.06,    7.11,   557800
  2010-01-08,  57.45,  58.62, 57.45,   58.47,   206500,   7.08, 7.3500,  6.95,    7.29,   588100
  2010-01-11,  58.79,  59.00, 57.22,   57.73,   331900,   7.38, 7.4500,  7.17,    7.22,   450500
  2010-01-12,  57.20,  57.21, 56.15,   56.34,   428500,   7.15, 7.1900,  6.87,    7.00,   694700
  2010-01-13,  56.32,  56.66, 54.83,   56.56,   577500,   7.05, 7.1700,  6.98,    7.15,   528800
  2010-01-14,  56.51,  57.05, 55.37,   55.53,   368100,   7.08, 7.1701,  7.08,    7.11,   279900
  2010-01-15,  56.59,  56.59, 55.19,   55.84,   417900,   7.03, 7.0500,  6.95,    7.03,   407600"
)

runs <- roll::roll_lm(
  x         = as.matrix(ds$x_Close),
  y         = as.matrix(ds$y_Close), 
  width     = 5, 
  intercept = FALSE
)

# The slope sits in the named column "x1" of the coefficients matrix,
# which is itself one element of the list returned by roll_lm().
ds$beta <- runs$coefficients[, "x1"]

ds$beta 
#  [1]        NA        NA        NA        NA 0.1224813
#  [6] 0.1238653 0.1242478 0.1246279 0.1256553 0.1259121

Double-check the alignment of the variables in your dataset. x_Close is around 50, while y_Close is around 7. That might explain the small disparity between the expected 0.1229065 and the 0.1224813 value above.
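To tie this back to the original data.table question, roll::roll_lm() can also be called in j so that the rolling regression runs separately for each group. The sketch below is illustrative only: it reuses the simulated PERMNO, Return and VWReturn columns from the earlier answer, the beta column name is made up here, and it assumes the slope column is still named "x1" when an intercept is included.

```r
library(data.table)
requireNamespace("roll")

set.seed(1)
dt <- data.table(PERMNO   = rep(LETTERS[1:3], each = 13),
                 Return   = runif(39),
                 VWReturn = runif(39))

# Rolling slope of Return on VWReturn over 12-row windows, computed per PERMNO.
# Assumption: with the default intercept, the slope column is named "x1".
dt[, beta := roll::roll_lm(
       x     = as.matrix(VWReturn),
       y     = as.matrix(Return),
       width = 12
     )$coefficients[, "x1"],
   by = PERMNO]

# Cross-check the last window of group A against lm()
chk <- dt[PERMNO == "A"][2:13]
all.equal(unname(coef(lm(Return ~ VWReturn, data = chk))[2]),
          as.numeric(dt[PERMNO == "A"][13, beta]))
```

The first 11 rows of each group are NA because roll_lm() requires a full window by default.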
