Bounded Cumulative Sum

Bounded cumulative sum?

As mentioned in the comments, Rcpp is a good way to go.

cumsumBounded.cpp:

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector cumsumBounded(NumericVector x, double low, double high) {
NumericVector res(x.size());
double acc = 0;
for (int i=0; i < x.size(); ++i) {
acc += x[i];
if (acc < low) acc = low;
else if (acc > high) acc = high;
res[i] = acc;
}
return res;
}

Compile and use new function:

library(Rcpp)
sourceCpp(file="cumsumBounded.cpp")
foo <- c(100, -200, 400, 200)
cumsumBounded(foo, 0, 500)
# [1] 100 0 400 500

bounded Cumulative sum in Netezza

The SQL syntax is:

SELECT dept_id,
salary,
SUM(salary) OVER(PARTITION BY dept_id
ORDER BY salary ASC
rows between <N> preceding and current row) cum_sum
FROM Employees;

Is there a function to solve a rolling cumulative sum?

Use rollapply with a width of list(-seq(2, 12)) which means use offsets -2, -3, ..., -12 and the third argument says to sum the points at those offsets. partial = TRUE means if those offsets don't exist since we are too close to the start of the series then just use whatever is available. fill = NA says if there are no points at all at those offsets since we are too close to the start fill them in with NA.

library(data.table)
library(zoo)

DT <- data.table(DF)
DT[, roll := rollapply(RET.USD, list(-seq(2, 12)), sum, partial = TRUE, fill = NA), by = Id]

giving:

 > DT
Id RET.USD month year roll
1: 13082U -30.15 1 2001 NA
2: 13082U 29.46 2 2001 NA
3: 13082U -9.23 3 2001 -30.15
4: 13082U -30.68 4 2001 -0.69
5: 13082U 15.96 5 2001 -9.92
6: 13082U -3.64 6 2001 -40.60
7: 13082U -39.86 7 2001 -24.64
8: 13082U 13.55 8 2001 -28.28
9: 13082U 0.26 9 2001 -68.14
10: 13082U -1.10 10 2001 -54.59
11: 13082U 19.28 11 2001 -54.33
12: 13082U -0.56 12 2001 -55.43
13: 13082U -3.31 1 2002 -36.15
14: 13082U 0.48 2 2002 -6.56
15: 13082U 0.86 3 2002 -39.33

Note

DF <- structure(list(Id = c("13082U", "13082U", "13082U", "13082U", 
"13082U", "13082U", "13082U", "13082U", "13082U", "13082U", "13082U",
"13082U", "13082U", "13082U", "13082U"), RET.USD = c(-30.15,
29.46, -9.23, -30.68, 15.96, -3.64, -39.86, 13.55, 0.26, -1.1,
19.28, -0.56, -3.31, 0.48, 0.86), month = c(1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L), year = c(2001L, 2001L,
2001L, 2001L, 2001L, 2001L, 2001L, 2001L, 2001L, 2001L, 2001L,
2001L, 2002L, 2002L, 2002L)), class = "data.frame", row.names = c(NA,
-15L))


Related Topics



Leave a reply



Submit