Cumulative Sum That Resets When 0 Is Encountered

Another base R option would be just

with(df, ave(b, cumsum(b == 0), FUN = cumsum))
## [1] 1 0 1 2

This divides column b into groups according to the appearances of 0 and computes the cumulative sum of b within each group.
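The same grouping trick translates directly to pandas: a cumulative sum over `b == 0` assigns a new group id at every zero, and a grouped cumulative sum then resets within each group (a sketch using the `b` values from the example above):

```python
import pandas as pd

df = pd.DataFrame({"b": [1, 0, 1, 1]})

# cumsum of (b == 0) starts a new group id at every zero,
# mirroring cumsum(b == 0) in the R call
grp = df["b"].eq(0).cumsum()

# cumulative sum of b within each group resets at each zero
df["whatiwant"] = df.groupby(grp)["b"].cumsum()
print(df["whatiwant"].tolist())  # [1, 0, 1, 2]
```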


Another solution, using data.table (v1.9.6+):

library(data.table) ## v 1.9.6+
setDT(df)[, whatiwant := cumsum(b), by = rleid(b == 0L)]
#    campaign   date b whatiwant
# 1:        a    jan 1         1
# 2:        b    feb 0         0
# 3:        c  march 1         1
# 4:        d  april 1         2

Some benchmarks, per the comments:

set.seed(123)
x <- sample(0:1e3, 1e7, replace = TRUE)
system.time(res1 <- ave(x, cumsum(x == 0), FUN = cumsum))
# user system elapsed
# 1.54 0.24 1.81
system.time(res2 <- Reduce(function(x, y) if (y == 0) 0 else x+y, x, accumulate=TRUE))
# user system elapsed
# 33.94 0.39 34.85
library(data.table)
system.time(res3 <- data.table(x)[, whatiwant := cumsum(x), by = rleid(x == 0L)])
# user system elapsed
# 0.20 0.00 0.21

identical(res1, as.integer(res2))
## [1] TRUE
identical(res1, res3$whatiwant)
## [1] TRUE
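The Reduce approach benchmarked above maps one-to-one onto Python's itertools.accumulate, which takes the same binary function (a sketch of the equivalence only, not a performance recommendation; the element-by-element loop is just as slow as the R Reduce):

```python
from itertools import accumulate

x = [1, 0, 1, 1, 2, 0, 3]

# same binary function as the R Reduce: restart at 0, otherwise add
res = list(accumulate(x, lambda acc, y: 0 if y == 0 else acc + y))
print(res)  # [1, 0, 1, 2, 4, 0, 3]
```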

SQL Server - Cumulative Sum that resets when 0 is encountered

In SQL Server 2008, you are severely limited because you cannot use cumulative window aggregates. The following is not efficient, but it will solve your problem:

with tg as (
      select t.*, g.grp
      from t cross apply
           (select count(*) as grp
            from t t2
            where t2.pk <= t.pk and t2.price = 0
           ) g
     )
select tg.*, p.running_price
from tg cross apply
     (select sum(tg2.price) as running_price
      from tg tg2
      where tg2.grp = tg.grp and tg2.pk <= tg.pk
     ) p;

Alas, prior to SQL Server 2012, the most efficient solution might involve cursors. In SQL Server 2012+, you simply do:

select t.*,
       sum(price) over (partition by grp order by pk) as running_price
from (select t.*,
             sum(case when price = 0 then 1 else 0 end) over (order by pk) as grp
      from t
     ) t;
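The 2012 query's two steps, derive grp as a running count of zero prices, then take a running sum of price within each grp, translate one-to-one into pandas (the column names t, pk, and price follow the query; the sample values are hypothetical):

```python
import pandas as pd

t = pd.DataFrame({"pk": [1, 2, 3, 4, 5],
                  "price": [10, 0, 5, 5, 0]})

# step 1: running count of zero prices, as in the inner SELECT
t["grp"] = t["price"].eq(0).cumsum()

# step 2: running sum of price within each grp
# (rows are already ordered by pk)
t["running_price"] = t.groupby("grp")["price"].cumsum()
print(t["running_price"].tolist())  # [10, 0, 5, 10, 0]
```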

How to perform cumsum with reset at 0 in R?

Create a temporary group column that starts a new group every time you encounter a 0.

library(dplyr)

df %>%
  group_by(ID, grp = cumsum(Counter == 0)) %>%
  mutate(Cumulative = cumsum(Counter)) %>%
  ungroup() %>%
  select(-grp) -> result

result

# ID Counter Cumulative
# <chr> <int> <int>
#1 A 1 1
#2 A 0 0
#3 A 1 1
#4 A 1 2
#5 B 1 1
#6 B 0 0
#7 B 1 1

The same logic can be implemented in base R and data.table as:

df$Cumulative <- with(df, ave(Counter, ID, cumsum(Counter == 0), FUN = cumsum))

library(data.table)
setDT(df)[, Cumulative := cumsum(Counter), .(ID, cumsum(Counter == 0))]

data

df <- structure(list(ID = c("A", "A", "A", "A", "B", "B", "B"), Counter = c(1L, 
0L, 1L, 1L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -7L))
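The per-ID reset also carries over to pandas: group by ID plus the running count of zeros, just as the dplyr answer groups by `ID` and `grp` (a sketch using the same toy data):

```python
import pandas as pd

df = pd.DataFrame({"ID": list("AAAABBB"),
                   "Counter": [1, 0, 1, 1, 1, 0, 1]})

# group by ID plus a running count of zeros, as in the dplyr answer
grp = df["Counter"].eq(0).cumsum()
df["Cumulative"] = df.groupby(["ID", grp])["Counter"].cumsum()
print(df["Cumulative"].tolist())  # [1, 0, 1, 2, 1, 0, 1]
```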

cumsum by participant and reset on 0 R

Does this work?

library(dplyr)
library(data.table)
df %>%
  mutate(grp = rleid(Correct)) %>%
  group_by(Participant, grp) %>%
  mutate(Count = cumsum(Correct)) %>%
  select(-grp)
# A tibble: 10 x 4
# Groups: Participant, grp [6]
grp Participant Correct Count
<int> <chr> <dbl> <dbl>
1 1 A 1 1
2 1 A 1 2
3 1 A 1 3
4 2 A 0 0
5 3 A 1 1
6 3 B 1 1
7 3 B 1 2
8 4 B 0 0
9 5 B 1 1
10 5 B 1 2

Toy data:

df <- data.frame(
  Participant = c(rep("A", 5), rep("B", 5)),
  Correct = c(1, 1, 1, 0, 1, 1, 1, 0, 1, 1)
)

Python pandas cumsum with reset everytime there is a 0

You can use:

a = df != 0
df1 = a.cumsum() - a.cumsum().where(~a).ffill().fillna(0).astype(int)
print(df1)

   a  b
0  0  1
1  1  2
2  0  3
3  1  0
4  2  1
5  0  2
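The snippet assumes a pre-existing DataFrame df, which the answer does not show. A runnable sketch, reconstructing a df consistent with the printed output (the specific values are an assumption): at every zero, the running total reached so far is carried forward by ffill and subtracted, which restarts the count.

```python
import pandas as pd

# hypothetical input consistent with the printed output above
df = pd.DataFrame({"a": [0, 1, 0, 1, 1, 0],
                   "b": [1, 1, 1, 0, 1, 1]})

a = df != 0
# subtract, at each zero, the cumulative count reached so far
# (carried forward by ffill), which resets the running count
df1 = a.cumsum() - a.cumsum().where(~a).ffill().fillna(0).astype(int)
print(df1["a"].tolist())  # [0, 1, 0, 1, 2, 0]
print(df1["b"].tolist())  # [1, 2, 3, 0, 1, 2]
```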

Count the number of NA values in a row - reset when 0

You can use max instead of cumsum in your attempt:

library(data.table)
setDT(df)[, whatiwant := max(Accumulated), by = rleid(b == 0L)]
df

# b Accumulated whatiwant
# 1: 1 1 3
# 2: 1 2 3
# 3: 1 3 3
# 4: 0 0 0
# 5: 1 1 4
# 6: 1 2 4
# 7: 1 3 4
# 8: 1 4 4
# 9: 0 0 0
#10: 0 0 0
#11: 0 0 0
#12: 1 1 2
#13: 1 2 2

Cumulative monthly sum with reset to zero at the beginning of each new month in pandas

If the data spans multiple years, group by month periods via Series.dt.to_period:

df['Cumulative Value'] = df.groupby(df['Date'].dt.to_period('m'))['Value'].cumsum()

If there is only one year, it is possible to use Series.dt.month:

df['Cumulative Value'] = df.groupby(df['Date'].dt.month)['Value'].cumsum() 

Also, a grouped cumulative sum restarts from zero at the start of each group by default, so no extra reset code is necessary.
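A runnable sketch of the to_period approach, on a hypothetical two-month sample (column names follow the answer):

```python
import pandas as pd

# hypothetical sample spanning two months
df = pd.DataFrame({"Date": pd.to_datetime(["2023-01-30", "2023-01-31",
                                           "2023-02-01", "2023-02-02"]),
                   "Value": [5, 3, 2, 4]})

# grouping by month period restarts the cumulative sum each month
df["Cumulative Value"] = df.groupby(df["Date"].dt.to_period("M"))["Value"].cumsum()
print(df["Cumulative Value"].tolist())  # [5, 8, 2, 6]
```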

Pandas - Cumulative Sum, with resets, using GroupBy

IIUC, you can do the cumsum after a groupby on the group column; every time the value column equals 0, you also start a new sub-group, itself built with cumsum.

test['consec'] = test.groupby(['group', test['value'].eq(0).cumsum()])['value'].cumsum()
test['max_c'] = test.groupby(['group'])['consec'].cummax()
print(test)
    group  idx  value  consec  max_c
0       1    0      0       0      0
1       1    1      1       1      1
2       1    2      0       0      1
3       1    3      1       1      1
4       1    4      1       2      2
5       1    5      1       3      3
6       1    6      1       4      4
7       2    0      0       0      0
8       2    1      1       1      1
9       2    2      1       2      2
10      2    3      1       3      3
11      2    4      0       0      3
12      2    5      1       1      3
13      2    6      0       0      3
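Reconstructing the test frame from the printed output makes the snippet runnable end to end:

```python
import pandas as pd

# rebuilt from the printed output above
test = pd.DataFrame({
    "group": [1] * 7 + [2] * 7,
    "idx":   list(range(7)) * 2,
    "value": [0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0],
})

# new sub-group every time value hits 0, nested inside 'group'
test["consec"] = test.groupby(["group", test["value"].eq(0).cumsum()])["value"].cumsum()
test["max_c"] = test.groupby("group")["consec"].cummax()
print(test["consec"].tolist())
# [0, 1, 0, 1, 2, 3, 4, 0, 1, 2, 3, 0, 1, 0]
```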

Cumulative sum that resets when the condition is no longer met

Note: this uses a global variable.

import pandas as pd

c = 0
def fun(x):
    global c
    if x['speed'] > 2.0:
        c = 0
    else:
        c = x['timedelta'] + c
    return c

df = pd.DataFrame({'datetime': ['1-1-2019 19:30:00'] * 7,
                   'speed': [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
                   'timedelta': [0, 2, 2, 2, 2, 4, 7]})

df['cum_sum'] = df.apply(fun, axis=1)
df

            datetime  speed  timedelta  cum_sum
0  1-1-2019 19:30:00    0.5          0        0
1  1-1-2019 19:30:00    0.7          2        2
2  1-1-2019 19:30:00    0.1          2        4
3  1-1-2019 19:30:00    5.0          2        0
4  1-1-2019 19:30:00   25.0          2        0
5  1-1-2019 19:30:00    0.1          4        4
6  1-1-2019 19:30:00    0.1          7       11
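The global state can be avoided entirely with the same reset-group trick used throughout this page, applied to the condition speed > 2.0 (a vectorized alternative sketch, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({"datetime": ["1-1-2019 19:30:00"] * 7,
                   "speed": [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
                   "timedelta": [0, 2, 2, 2, 2, 4, 7]})

reset = df["speed"] > 2.0
# each reset row starts a new group; zero out its own contribution,
# then take the cumulative sum of timedelta within each group
df["cum_sum"] = (df["timedelta"].where(~reset, 0)
                   .groupby(reset.cumsum()).cumsum())
print(df["cum_sum"].tolist())  # [0, 2, 4, 0, 0, 4, 11]
```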

