Efficient calculation of matrix cumulative standard deviation in R
You could use cumsum to compute the running sums that the direct variance/sd formulas need, which turns the calculation into vectorized operations on each column of the matrix:
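cumsd_mod <- function(mat) {
    cum_var <- function(x) {
        ind_na <- !is.na(x)    # TRUE where x is observed (not NA)
        nn <- cumsum(ind_na)   # running count of observed values
        x[!ind_na] <- 0        # NAs contribute nothing to the running sums
        cumsum(x^2) / (nn-1) - (cumsum(x))^2/(nn-1)/nn
    }
    v <- sqrt(apply(mat,2,cum_var))
    v[is.na(mat) | is.infinite(v)] <- NA
    v
}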
Just for comparison:
set.seed(2765374)
X <- matrix(rnorm(1000),100,10)
X[cbind(1:10,1:10)] <- NA # to have some NA's
all.equal(cumsd(X),cumsd_mod(X))
# [1] TRUE
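The cumsd() used in this comparison is the asker's original function, which is not reproduced here. A minimal sketch of what such a baseline might look like, assuming it returns NA wherever the input is NA and otherwise the sd of all non-missing values seen so far in that column (cumsd_naive is a hypothetical name):

```r
# Hypothetical baseline, not the asker's actual cumsd():
# recompute sd() on a growing prefix of every column.
cumsd_naive <- function(mat) {
  apply(mat, 2, function(x) {
    vapply(seq_along(x), function(i) {
      if (is.na(x[i])) NA_real_ else sd(x[seq_len(i)], na.rm = TRUE)
    }, numeric(1))
  })
}

X_small <- matrix(c(1, 2, 4, NA, 3, 5), nrow = 3)
cumsd_naive(X_small)
```

Every prefix is rescanned from the start, so the cost grows quadratically with the number of rows. That is the work the cumsum version avoids.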
And about timing:
X <- matrix(rnorm(100000),1000,100)
system.time(cumsd(X))
# user system elapsed
# 7.94 0.00 7.97
system.time(cumsd_mod(X))
# user system elapsed
# 0.03 0.00 0.03
Calculate cumulative standard deviation
Use TTR::runSD with cumulative=TRUE.
library(TTR)
library(xts)  # xts() is not attached by library(TTR) alone
x <- xts(test.df[,2],test.df[,1])
runSD(x, n=1, cumulative=TRUE)
Surprisingly Slow Standard Deviation in R
You might also try an algorithm that updates the standard deviation (well, actually, the sum of squares of differences from the mean) as you go. On my system this reduces the time from ~0.8 seconds to ~0.002 seconds.
n <- length(x)
m <- cumsum(x)/(1:n)
m1 <- c(NA,m[1:(n-1)])
ssd <- (x-m)*(x-m1)
v <- c(0,cumsum(ssd[-1])/(1:(n-1)))
z <- sqrt(v)
See http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance for details.
Also see the answers to this question: Efficient calculation of matrix cumulative standard deviation in r
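To make the update easier to reuse and check, here is a sketch that wraps the same steps in a function and compares the result with a brute-force loop over sd(). The name running_sd is hypothetical, the input is assumed to contain no NA values, and unlike the snippet above it returns NA rather than 0 for the first element so that it lines up with sd() on a single value:

```r
# Sketch: running standard deviation via the incremental
# sum-of-squared-deviations update (assumes no NA in x).
running_sd <- function(x) {
  n <- length(x)
  if (n < 2L) return(rep(NA_real_, n))
  m <- cumsum(x) / seq_len(n)      # running mean
  m_prev <- c(NA_real_, m[-n])     # running mean lagged by one step
  ssd <- (x - m) * (x - m_prev)    # increment to the sum of squared deviations
  c(NA_real_, sqrt(cumsum(ssd[-1]) / seq_len(n - 1L)))
}

set.seed(42)
x <- rnorm(50)
brute <- vapply(seq_along(x), function(i) sd(x[seq_len(i)]), numeric(1))
all.equal(running_sd(x), brute)
```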
Efficient calculation of var-covar matrix in R
@F. Privé's Rcpp implementation is a good starting place, but we can do better. You will notice that the main algorithm supplied by the OP repeats many fairly expensive calculations. Observe:
OPalgo <- function(m, p, ind1, n) {
    vcov <- matrix(0, nrow = n + 1L, ncol = n + 1L)
    for (i in 0L:n) {
        for (j in i:n) {
            ## lower and upper range for the first & second multiplicand
            print(paste(c((1L + (j - i)), ":", (p - i), " ",
                          1L, ":", (p - j)), collapse = ""))
            vcov[j + 1L, i + 1L] <-
                sum(m[, (1L + (j - i)):(p - i)] *
                    m[, 1L:(p - j)]) /
                (ind1 * (p - j) - 1)
        }
    }
    vcov
}
OPalgo(mat, periods, ind, n_lags)
[1] "1:70 1:70" ## contains "1:65 1:65"
[1] "2:70 1:69"
[1] "3:70 1:68"
[1] "4:70 1:67"
[1] "5:70 1:66"
[1] "6:70 1:65"
[1] "1:69 1:69" ## contains "1:65 1:65"
[1] "2:69 1:68"
[1] "3:69 1:67"
[1] "4:69 1:66"
[1] "5:69 1:65"
[1] "1:68 1:68" ## contains "1:65 1:65"
[1] "2:68 1:67"
[1] "3:68 1:66"
[1] "4:68 1:65"
[1] "1:67 1:67" ## contains "1:65 1:65"
[1] "2:67 1:66"
[1] "3:67 1:65"
[1] "1:66 1:66" ## contains "1:65 1:65"
[1] "2:66 1:65"
[1] "1:65 1:65"
As you can see, the product mat[,1:65] * mat[,1:65] is performed 6 times above. The only difference between the first occurrence and the last occurrence is that the first occurrence has an additional 5 columns. So instead of computing:
sum(mat[ , 1:70] * mat[ , 1:70])
sum(mat[ , 1:69] * mat[ , 1:69])
sum(mat[ , 1:68] * mat[ , 1:68])
sum(mat[ , 1:67] * mat[ , 1:67])
sum(mat[ , 1:66] * mat[ , 1:66])
sum(mat[ , 1:65] * mat[ , 1:65])
we can compute preCalc[1] <- sum(mat[ , 1:65] * mat[ , 1:65]) one time and use this in the other 5 calculations like so:
preCalc[1] + sum(mat[ , 66:70] * mat[ , 66:70])
preCalc[1] + sum(mat[ , 66:69] * mat[ , 66:69])
preCalc[1] + sum(mat[ , 66:68] * mat[ , 66:68])
preCalc[1] + sum(mat[ , 66:67] * mat[ , 66:67])
preCalc[1] + sum(mat[ , 66:66] * mat[ , 66:66])
In each of the above, we have reduced the number of multiplications by 90000 * 65 = 5,850,000 and the number of additions by 5,850,000 - 1 = 5,849,999, for a total of 11,699,999 arithmetic operations saved. The function below achieves this very thing.
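The identity behind the saving can be checked on a small stand-in matrix. This is only a sketch: the 20 x 70 random matrix below is a hypothetical substitute for the OP's mat, which is not reproduced here.

```r
set.seed(1)
# Hypothetical small stand-in for the OP's matrix (same number of columns)
mat <- matrix(rnorm(20 * 70), nrow = 20, ncol = 70)

# Shared part, computed once
pre_calc <- sum(mat[, 1:65] * mat[, 1:65])

# Full recomputation for every upper bound 66..70
full <- vapply(66:70, function(k) sum(mat[, 1:k] * mat[, 1:k]), numeric(1))

# Shared part plus only the extra columns
incremental <- vapply(66:70, function(k) {
  pre_calc + sum(mat[, 66:k] * mat[, 66:k])
}, numeric(1))

all.equal(full, incremental)
```

The two vectors agree up to floating-point rounding, since the order of summation differs.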
fasterAlgo <- function(m, p, ind1, n) {
    vcov <- matrix(0, nrow = n + 1L, ncol = n + 1L)
    ## sums shared by every (i, j) pair, computed once per lag
    preCals <- vapply(1:(n + 1L), function(x) sum(m[ , x:(p - n + x - 2L)] *
                                                  m[ , 1L:(p - n - 1L)]), 42.42)
    for (i in 0L:n) {
        for (j in i:n) {
            myNum <- preCals[1L + j - i] + sum(m[, (p - n + j - i):(p - i)] * m[, (p - n):(p - j)])
            vcov[j + 1L, i + 1L] <- myNum / (ind1 * (p - j) - 1)
        }
    }
    vcov
}
## outputs same results
all.equal(OPalgo(mat, periods, ind, n_lags), fasterAlgo(mat, periods, ind, n_lags))
[1] TRUE
Benchmarks:
## I commented out the print statements of the OPalgo before benchmarking
library(microbenchmark)
microbenchmark(OP = OPalgo(mat, periods, ind, n_lags),
fasterBase = fasterAlgo(mat, periods, ind, n_lags),
RcppOrig = compute_vcov(mat, n_lags), times = 5)
Unit: milliseconds
expr min lq mean median uq max neval cld
OP 2775.6110 2780.7207 2843.6012 2784.976 2899.7621 2976.9356 5 c
fasterBase 863.3897 863.9681 865.5576 865.593 866.7962 868.0409 5 b
RcppOrig 160.1040 161.8922 162.0153 162.235 162.4756 163.3697 5 a
As you can see, with this modification we see at least a 3-fold improvement, but the Rcpp version is still much faster. Let's implement the above concept in Rcpp.
#include <Rcpp.h>
#include <vector>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix compute_vcov2(const NumericMatrix& mat, int n_lags) {
    NumericMatrix vcov(n_lags + 1, n_lags + 1);
    std::vector<double> preCalcs;
    preCalcs.reserve(n_lags + 1);
    double myCov;
    int i, j, k1, k2, l;
    int n = mat.nrow();
    int m = mat.ncol();

    // sums shared by every (i, j) pair, computed once per lag
    for (i = 0; i <= n_lags; i++) {
        myCov = 0;
        for (k1 = i, k2 = 0; k2 < (m - n_lags - 1); k1++, k2++) {
            for (l = 0; l < n; l++) {
                myCov += mat(l, k1) * mat(l, k2);
            }
        }
        preCalcs.push_back(myCov);
    }

    for (i = 0; i <= n_lags; i++) {
        for (j = i; j <= n_lags; j++) {
            myCov = preCalcs[j - i];
            // add only the columns not covered by the precomputed sum
            for (k1 = m - n_lags + j - i - 1, k2 = m - n_lags - 1; k2 < (m - j); k1++, k2++) {
                for (l = 0; l < n; l++) {
                    myCov += mat(l, k1) * mat(l, k2);
                }
            }
            myCov /= n * (m - j) - 1;
            vcov(i, j) = vcov(j, i) = myCov;
        }
    }
    return vcov;
}
## gives same results
all.equal(compute_vcov2(mat, n_lags), compute_vcov(mat, n_lags))
[1] TRUE
New benchmarks:
microbenchmark(OP = OPalgo(mat, periods, ind, n_lags),
fasterBase = fasterAlgo(mat, periods, ind, n_lags),
RcppOrig = compute_vcov(mat, n_lags),
RcppModified = compute_vcov2(mat, n_lags), times = 5)
Unit: milliseconds
expr min lq mean median uq max neval cld
OP 2785.4789 2786.67683 2811.02528 2789.37719 2809.61270 2883.98073 5 d
fasterBase 866.5601 868.25555 888.64418 869.31796 870.92308 968.16417 5 c
RcppOrig 160.3467 161.37992 162.74899 161.73009 164.38653 165.90174 5 b
RcppModified 51.1641 51.67149 52.87447 52.56067 53.06273 55.91334 5 a
Now the enhanced Rcpp solution is around 3x faster than the original Rcpp solution and around 50x faster than the original algorithm provided by the OP.
Update
We can do even better. We can reverse the ranges of the indices i/j so as to continuously update preCalcs. This allows us to compute the product of only one new column every iteration. This really comes into play as n_lags increases. Observe:
// [[Rcpp::export]]
NumericMatrix compute_vcov3(const NumericMatrix& mat, int n_lags) {
    NumericMatrix vcov(n_lags + 1, n_lags + 1);
    std::vector<double> preCalcs;
    preCalcs.reserve(n_lags + 1);
    int i, j, k1, k2, l;
    int n = mat.nrow();
    int m = mat.ncol();

    for (i = 0; i <= n_lags; i++) {
        preCalcs.push_back(0);
        for (k1 = i, k2 = 0; k2 < (m - n_lags); k1++, k2++) {
            for (l = 0; l < n; l++) {
                preCalcs[i] += mat(l, k1) * mat(l, k2);
            }
        }
    }

    for (i = n_lags; i >= 0; i--) {      // reverse range
        for (j = n_lags; j >= i; j--) {  // reverse range
            vcov(i, j) = vcov(j, i) = preCalcs[j - i] / (n * (m - j) - 1);
            if (i > 0) {  // j >= i, so j > 0 as well
                for (k1 = m - i, k2 = m - j; k2 <= (m - j); k1++, k2++) {
                    for (l = 0; l < n; l++) {
                        // updating preCalcs vector
                        preCalcs[j - i] += mat(l, k1) * mat(l, k2);
                    }
                }
            }
        }
    }
    return vcov;
}
all.equal(compute_vcov(mat, n_lags), compute_vcov3(mat, n_lags))
[1] TRUE
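For readers who want to follow the running-update logic without compiling anything, here is a plain R sketch of the same idea, checked against a direct computation. The function names and the small random matrix are hypothetical, and the sketch assumes ncol(m) is greater than n_lags; it is meant to illustrate the indexing, not to be fast.

```r
# R sketch of the running-update idea (hypothetical helper names).
vcov_incremental <- function(m, n_lags) {
  n <- nrow(m); p <- ncol(m)
  base <- p - n_lags                       # columns shared by every (i, j) pair
  vc <- matrix(0, n_lags + 1L, n_lags + 1L)
  # pre[d + 1]: sum of products at lag d over the first `base` columns
  pre <- vapply(0:n_lags, function(d) {
    sum(m[, (1L + d):(base + d), drop = FALSE] * m[, 1L:base, drop = FALSE])
  }, numeric(1))
  for (i in n_lags:0) {
    for (j in n_lags:i) {
      d <- j - i
      vc[i + 1L, j + 1L] <- vc[j + 1L, i + 1L] <- pre[d + 1L] / (n * (p - j) - 1)
      # extend the lag-d sum by exactly one more pair of columns
      if (i > 0L) {
        pre[d + 1L] <- pre[d + 1L] + sum(m[, p - i + 1L] * m[, p - j + 1L])
      }
    }
  }
  vc
}

# Direct (unoptimised) computation for comparison
vcov_direct <- function(m, n_lags) {
  n <- nrow(m); p <- ncol(m)
  vc <- matrix(0, n_lags + 1L, n_lags + 1L)
  for (i in 0:n_lags) {
    for (j in i:n_lags) {
      vc[i + 1L, j + 1L] <- vc[j + 1L, i + 1L] <-
        sum(m[, (1L + j - i):(p - i), drop = FALSE] *
            m[, 1L:(p - j), drop = FALSE]) / (n * (p - j) - 1)
    }
  }
  vc
}

set.seed(1)
m_small <- matrix(rnorm(30 * 12), nrow = 30)
all.equal(vcov_incremental(m_small, 4L), vcov_direct(m_small, 4L))
```

Because pairs with the same lag j - i are visited from the largest i downwards, each visit needs exactly one more column pair than the previous one, which is why a single update per iteration is enough.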
Rcpp benchmarks only:
n_lags <- 50L
microbenchmark(RcppOrig = compute_vcov(mat, n_lags),
RcppModified = compute_vcov2(mat, n_lags),
RcppExtreme = compute_vcov3(mat, n_lags), times = 5)
Unit: milliseconds
expr min lq mean median uq max neval cld
RcppOrig 7035.7920 7069.7761 7083.4961 7070.3395 7119.028 7122.5446 5 c
RcppModified 3608.8986 3645.8585 3653.0029 3654.7209 3663.716 3691.8202 5 b
RcppExtreme 324.8252 330.7381 332.9657 333.5919 335.168 340.5054 5 a
The newest implementation is now over 20x faster than the original Rcpp version and well over 300x faster than the original algorithm when n_lags is large.
R: calculating population standard deviation with NA
We can use na.rm=TRUE in the mean and sum to account for the NA elements.
pop.sd<-function(x){sqrt(sum((x-mean(x, na.rm=TRUE))^2,
na.rm=TRUE)/sum(!is.na(x)))}
apply(mf2, 1, pop.sd)
#[1] 25.152866 13.500000 7.586538 31.443070 0.000000 32.967998
This should also give the same result for 'mf1'
apply(mf1,1,pop.sd)
#[1] 25.152866 12.498889 7.586538 31.443070 22.156012 32.967998
Instead of looping over the rows, we can also use the vectorized rowSums and rowMeans. Divide by the number of non-NA values in each row rather than by ncol, so that rows containing NA match pop.sd:
sqrt(rowSums((mf1-rowMeans(mf1, na.rm=TRUE))^2, na.rm=TRUE)/rowSums(!is.na(mf1)))
#[1] 25.152866 12.498889 7.586538 31.443070 22.156012 32.967998
sqrt(rowSums((mf2-rowMeans(mf2, na.rm=TRUE))^2, na.rm=TRUE)/rowSums(!is.na(mf2)))
#[1] 25.152866 13.500000 7.586538 31.443070 0.000000 32.967998
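Since mf1 and mf2 are not reproduced here, the following sketch uses a small hypothetical matrix to show that the row-wise apply and a vectorized version agree once the divisor is the per-row count of non-missing values:

```r
# Hypothetical data; row 2 contains an NA
m <- matrix(c(1, 2, 3,
              4, NA, 8,
              5, 5, 5), nrow = 3, byrow = TRUE)

pop.sd <- function(x) {
  sqrt(sum((x - mean(x, na.rm = TRUE))^2, na.rm = TRUE) / sum(!is.na(x)))
}

by_row <- apply(m, 1, pop.sd)

# Same calculation without the loop; the divisor is the number of
# observed values in each row, not ncol(m)
vectorized <- sqrt(rowSums((m - rowMeans(m, na.rm = TRUE))^2, na.rm = TRUE) /
                   rowSums(!is.na(m)))

all.equal(by_row, vectorized)
```

Row 2 has two observed values (4 and 8), so its population sd is 2. Dividing by ncol(m) = 3 would understate it.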
R: using apply.fromstart to calculate returns and standard deviation
Use the very efficient vectorized base R function cumprod for your first desired result. The second result can be achieved (less efficiently) with a simple *apply loop.
If you want to keep the zoo class, do
cumprod(hourlyData$Position)
# 1 2 3 4 5 6
# 1.0000000 0.9929392 0.9224669 0.9754125 1.0673348 1.1547867
Otherwise
cumprod(as.numeric(hourlyData$Position))
## [1] 1.0000000 0.9929392 0.9224669 0.9754125 1.0673348 1.1547867
For sd (as proposed by @akrun), I used vapply instead of sapply in order to "squeeze" maximum performance out of it:
vapply(seq_len(nrow(hourlyData)), function(i) sd(hourlyData$Position[1:i]), FUN.VALUE = double(1))
# [1] NA 0.004992723 0.039097989 0.052519398 0.063598345 0.063156702
Cumulative sum over matrix diagonals
From the example it seems that every diagonal is all zeros or else is a sequence of ones followed by zeros. We assume that that is always the case.
First form a function cum which takes a diagonal x and outputs a vector of zeros of the same length, except that position sum(x) is set to sum(x).
Then apply that function across diagonals using ave. row(m1)-col(m1) is constant on diagonals and can be used for grouping.
cum <- function(x, s = sum(x)) replace(0 * x, s, s)
ave(m1, row(m1) - col(m1), FUN = cum)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0 0 0 0 0
## [2,] 0 0 0 0 0
## [3,] 0 2 0 0 3
## [4,] 0 0 0 0 0
## [5,] 0 0 0 0 0
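The input m1 is not reproduced here, so as a sketch, the same two steps can be tried on a small hypothetical matrix whose diagonals each start with a run of ones (m_demo is an assumed name):

```r
cum <- function(x, s = sum(x)) replace(0 * x, s, s)

# Hypothetical input: each diagonal is a run of ones followed by zeros
m_demo <- matrix(0, 4, 4)
m_demo[cbind(1:3, 1:3)] <- 1   # main diagonal: 1, 1, 1, 0
m_demo[cbind(1:2, 2:3)] <- 1   # first superdiagonal: 1, 1, 0

# row - col is constant along each diagonal, so it serves as the group key
res <- ave(m_demo, row(m_demo) - col(m_demo), FUN = cum)
res
```

Here the main diagonal sums to 3, so the total lands in its third position, res[3, 3]. The superdiagonal sums to 2, so the total lands in res[2, 3]. Diagonals that are all zero are left unchanged because replace() with index 0 modifies nothing.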
If the sequence of ones on a diagonal need not start at the beginning of the diagonal, but there is still at most one sequence of ones on each diagonal, then use this in place of cum above:
cum <- function(x, s = sum(x)) replace(0 * x, s + which.max(x) - 1, s)
If there can be more than one sequence of ones on a diagonal, then use this in place of cum above:
library(data.table)
cum <- function(x) {
ave(x, rleid(x), FUN = function(x, s = sum(x)) replace(0 * x, s, s))
}