Cumulative Sum with Lag

Cumulative sum with lag

You can use 0 for the first element, and remove the last element using head(, -1)

transform(df, previous_comments=ave(comment_count, member_id, 
FUN = function(x) cumsum(c(0, head(x, -1)))))
# member_id entry_id comment_count timestamp previous_comments
#1 1 a 4 2008-06-09 12:41:00 0
#2 1 b 1 2008-07-14 18:41:00 4
#3 1 c 3 2008-07-17 15:40:00 5
#4 2 d 12 2008-06-09 12:41:00 0
#5 2 e 50 2008-09-18 10:22:00 12
#6 3 f 0 2008-10-03 13:36:00 0

cumulative sum by ID with lag

With dplyr -

df %>% 
group_by(id) %>%
mutate(sum = lag(cumsum(amount), default = 0)) %>%
ungroup()

# A tibble: 7 x 3
id amount sum
<dbl> <dbl> <dbl>
1 1 100 0
2 1 20 100
3 1 150 120
4 2 60 0
5 2 100 60
6 1 30 270
7 2 40 160

Thanks to @thelatemail here's the data.table version -

df[, sum := cumsum(shift(amount, fill=0)), by=id]

How to do cumulative sum on lagged values in python?

et voila!

s.rolling(window = 2).sum()

you can further shift / lag your input vector to get different starting points.

s.shift(1).rolling(window = 2).sum()

Sum multiple columns with lag function SQL

I think you want a cumulative sum:

select t.*,
sum(lead_time_days) over (partition by clientno
order by startdate
) as cumulative_lead_time_days
from t

Calculating the sum of a column dynamically using lag function on the same un-computed column?

You can use the sum window function:

select *,
sum(rooms) over(order by Day) as dynamic_sum
from table_name;

Fiddle



Related Topics



Leave a reply



Submit