Cumulative sum that resets when 0 is encountered
Another base R option is simply
with(df, ave(b, cumsum(b == 0), FUN = cumsum))
## [1] 1 0 1 2
This simply divides column b into groups according to the appearances of 0 and computes the cumulative sum of b within each group.
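The grouping trick above (a running count of zeros as the group id) can be sketched in Python for illustration; the data mirrors the example output:

```python
from itertools import accumulate

b = [1, 0, 1, 1]  # mirrors the example column above

# Group id: increments every time a 0 is seen, so each 0 starts a new group.
grp = list(accumulate(int(x == 0) for x in b))  # [0, 1, 1, 1]

# Cumulative sum within each group.
out = []
prev_grp = None
running = 0
for g, x in zip(grp, b):
    if g != prev_grp:
        running = 0
        prev_grp = g
    running += x
    out.append(running)

print(out)  # [1, 0, 1, 2]
```

The group id never decreases, so each zero permanently opens a fresh group and the sum restarts there.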
Another solution uses data.table (v1.9.6+):
library(data.table) ## v 1.9.6+
setDT(df)[, whatiwant := cumsum(b), by = rleid(b == 0L)]
# campaign date b whatiwant
# 1: a jan 1 1
# 2: b feb 0 0
# 3: c march 1 1
# 4: d april 1 2
Some benchmarks, per the comments:
set.seed(123)
x <- sample(0:1e3, 1e7, replace = TRUE)
system.time(res1 <- ave(x, cumsum(x == 0), FUN = cumsum))
# user system elapsed
# 1.54 0.24 1.81
system.time(res2 <- Reduce(function(x, y) if (y == 0) 0 else x+y, x, accumulate=TRUE))
# user system elapsed
# 33.94 0.39 34.85
library(data.table)
system.time(res3 <- data.table(x)[, whatiwant := cumsum(x), by = rleid(x == 0L)])
# user system elapsed
# 0.20 0.00 0.21
identical(res1, as.integer(res2))
## [1] TRUE
identical(res1, res3$whatiwant)
## [1] TRUE
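For comparison, the slow `Reduce` variant is just an element-by-element fold; a single-pass loop (sketched here in Python, as a hypothetical port of the same logic) shows the linear-time shape without per-element closure overhead:

```python
def cumsum_reset(xs):
    """Cumulative sum that restarts from 0 whenever a 0 is encountered."""
    out = []
    running = 0
    for x in xs:
        # A zero resets the accumulator; anything else extends it.
        running = 0 if x == 0 else running + x
        out.append(running)
    return out

print(cumsum_reset([1, 0, 1, 2, 0, 3]))  # [1, 0, 1, 3, 0, 3]
```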
SQL Server - Cumulative Sum that resets when 0 is encountered
In SQL Server 2008, you are severely limited because you cannot use analytic functions. The following is not efficient, but it will solve your problem:
with tg as (
select t.*, g.grp
from t cross apply
(select count(*) as grp
from t t2
where t2.pk <= t.pk and t2.price = 0
) g
)
select tg.*, p.running_price
from tg cross apply
(select sum(tg2.price) as running_price
from tg tg2
where tg2.grp = tg.grp and tg2.pk <= tg.pk
) p;
Alas, prior to SQL Server 2012, the most efficient solution might involve cursors. In SQL Server 2012+, you simply do:
select t.*,
sum(price) over (partition by grp order by pk) as running_price
from (select t.*,
sum(case when price = 0 then 1 else 0 end) over (order by pk) as grp
from t
) t;
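The 2012+ query's two steps (a running count of zero prices as `grp`, then a windowed sum within each `grp`) can be sketched in pandas for illustration; the sample rows are made up, and the column names follow the SQL above:

```python
import pandas as pd

# Hypothetical sample rows mirroring table t.
t = pd.DataFrame({"pk": [1, 2, 3, 4, 5],
                  "price": [10, 0, 5, 7, 0]})

# Step 1: running count of zero prices = the group id (the inner SELECT).
t["grp"] = (t["price"] == 0).cumsum()

# Step 2: running sum of price within each group (the windowed SUM).
t["running_price"] = t.groupby("grp")["price"].cumsum()

print(t)
```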
How to perform cumsum with reset at 0 in R?
Create a temporary group column that starts a new group every time a 0 is encountered.
library(dplyr)
df %>%
group_by(ID, grp = cumsum(Counter == 0)) %>%
mutate(Cumulative = cumsum(Counter)) %>%
ungroup() %>%
select(-grp) -> result
result
# ID Counter Cumulative
# <chr> <int> <int>
#1 A 1 1
#2 A 0 0
#3 A 1 1
#4 A 1 2
#5 B 1 1
#6 B 0 0
#7 B 1 1
The same logic can be implemented in base R and data.table as:
df$Cumulative <- with(df, ave(Counter, ID, cumsum(Counter == 0), FUN = cumsum))
library(data.table)
setDT(df)[, Cumulative := cumsum(Counter), .(ID, cumsum(Counter == 0))]
data
df <- structure(list(ID = c("A", "A", "A", "A", "B", "B", "B"), Counter = c(1L,
0L, 1L, 1L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -7L))
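The same two-level grouping (by `ID` plus a zero-based group id) translates directly to pandas; a sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "A", "A", "A", "B", "B", "B"],
                   "Counter": [1, 0, 1, 1, 1, 0, 1]})

# New group every time Counter hits 0; grouping by ID as well
# keeps the resets from leaking across IDs.
grp = (df["Counter"] == 0).cumsum()
df["Cumulative"] = df.groupby(["ID", grp])["Counter"].cumsum()

print(df)
```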
cumsum by participant and reset on 0 R
Does this work?
library(dplyr)
library(data.table)
df %>%
mutate(grp = rleid(Correct)) %>%
group_by(Participant, grp) %>%
mutate(Count = cumsum(Correct)) %>%
select(- grp)
(grp is a grouping variable, so select() keeps it; ungroup() first if you want it dropped.)
# A tibble: 10 x 4
# Groups: Participant, grp [6]
grp Participant Correct Count
<int> <chr> <dbl> <dbl>
1 1 A 1 1
2 1 A 1 2
3 1 A 1 3
4 2 A 0 0
5 3 A 1 1
6 3 B 1 1
7 3 B 1 2
8 4 B 0 0
9 5 B 1 1
10 5 B 1 2
Toy data:
df <- data.frame(
Participant = c(rep("A", 5), rep("B", 5)),
Correct = c(1,1,1,0,1,1,1,0,1,1)
)
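data.table's `rleid` (run-length id) has no direct pandas builtin, but the standard idiom `s.ne(s.shift()).cumsum()` produces the same run ids; a sketch on the toy data above:

```python
import pandas as pd

df = pd.DataFrame({
    "Participant": ["A"] * 5 + ["B"] * 5,
    "Correct": [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
})

# Run-length id: a new id starts whenever the value changes (like rleid()).
grp = df["Correct"].ne(df["Correct"].shift()).cumsum()

df["Count"] = df.groupby(["Participant", grp])["Correct"].cumsum()
print(df)
```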
Python pandas cumsum with reset everytime there is a 0
You can use:
a = df != 0
df1 = a.cumsum()-a.cumsum().where(~a).ffill().fillna(0).astype(int)
print (df1)
a b
0 0 1
1 1 2
2 0 3
3 1 0
4 2 1
5 0 2
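The input frame is not shown above; one reconstructed from the printed output (inferred, so treat it as an assumption) makes the snippet runnable end to end:

```python
import pandas as pd

# Reconstructed input, inferred from the output above.
df = pd.DataFrame({"a": [0, 1, 0, 1, 1, 0],
                   "b": [1, 1, 1, 0, 1, 1]})

a = df != 0
# Running count of nonzeros, minus that count frozen at the last zero:
# the difference is the length of the current nonzero streak.
df1 = a.cumsum() - a.cumsum().where(~a).ffill().fillna(0).astype(int)
print(df1)
```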
Count the number of NA values in a row - reset when 0
You can use max instead of cumsum in your attempt:
library(data.table)
setDT(df)[, whatiwant := max(Accumulated), by = rleid(b == 0L)]
df
# b Accumulated whatiwant
# 1: 1 1 3
# 2: 1 2 3
# 3: 1 3 3
# 4: 0 0 0
# 5: 1 1 4
# 6: 1 2 4
# 7: 1 3 4
# 8: 1 4 4
# 9: 0 0 0
#10: 0 0 0
#11: 0 0 0
#12: 1 1 2
#13: 1 2 2
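The same "broadcast the group maximum back to every row" step can be sketched in pandas with `transform('max')` over run-length groups; the data mirrors the output above:

```python
import pandas as pd

df = pd.DataFrame({"b":           [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1],
                   "Accumulated": [1, 2, 3, 0, 1, 2, 3, 4, 0, 0, 0, 1, 2]})

# Run-length id over (b == 0), like data.table's rleid(b == 0L).
runs = df["b"] == 0
grp = runs.ne(runs.shift()).cumsum()

# Per-run maximum, broadcast back to each row of the run.
df["whatiwant"] = df.groupby(grp)["Accumulated"].transform("max")
print(df)
```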
Cumulative monthly sum with reset to zero at the beginning of each new month in pandas
If the data spans multiple years, group by month periods via Series.dt.to_period:
df['Cumulative Value'] = df.groupby(df['Date'].dt.to_period('m'))['Value'].cumsum()
If there is only one year, you can use Series.dt.month:
df['Cumulative Value'] = df.groupby(df['Date'].dt.month)['Value'].cumsum()
Also, the cumulative sum restarts from 0 at the start of each new group by default, so no extra code is needed for that.
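A minimal runnable sketch of the period-based grouping (the dates are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-10", "2023-01-20", "2023-02-05",
                            "2023-02-15", "2024-01-03"]),
    "Value": [1, 2, 3, 4, 5],
})

# to_period('m') keeps the year, so January 2023 and January 2024
# land in different groups; .dt.month alone would merge them.
df["Cumulative Value"] = df.groupby(df["Date"].dt.to_period("m"))["Value"].cumsum()
print(df)
```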
Pandas - Cumulative Sum, with resets, using GroupBy
IIUC, you can take the cumsum after a groupby on the group column, paired with a second key that starts a new group every time value is 0 (itself built with a cumsum of the zero indicator).
test['consec'] = test.groupby(['group', test['value'].eq(0).cumsum()])['value'].cumsum()
test['max_c'] = test.groupby(['group'])['consec'].cummax()
print(test)
group idx value consec max_c
0 1 0 0 0 0
1 1 1 1 1 1
2 1 2 0 0 1
3 1 3 1 1 1
4 1 4 1 2 2
5 1 5 1 3 3
6 1 6 1 4 4
7 2 0 0 0 0
8 2 1 1 1 1
9 2 2 1 2 2
10 2 3 1 3 3
11 2 4 0 0 3
12 2 5 1 1 3
13 2 6 0 0 3
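The `test` frame is not shown above; one reconstructed from the printed output (an assumption) makes the two lines runnable:

```python
import pandas as pd

# Reconstructed from the printed output above.
test = pd.DataFrame({
    "group": [1] * 7 + [2] * 7,
    "idx":   list(range(7)) * 2,
    "value": [0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0],
})

# New sub-group at every zero; cumsum within (group, sub-group) resets there.
test["consec"] = test.groupby(["group", test["value"].eq(0).cumsum()])["value"].cumsum()
# Running maximum of the streak length within each group.
test["max_c"] = test.groupby(["group"])["consec"].cummax()
print(test)
```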
Cumulative sum that resets when the condition is no longer met
Note: this solution uses a global variable.
import pandas as pd

c = 0
def fun(x):
    global c
    if x['speed'] > 2.0:
        c = 0                    # reset the accumulator when speed exceeds 2.0
    else:
        c = x['timedelta'] + c   # otherwise keep accumulating timedelta
    return c
df = pd.DataFrame( {'datetime': ['1-1-2019 19:30:00']*7,
'speed': [0.5,.7,0.1,5.0,25.0,0.1,0.1], 'timedelta': [0,2,2,2,2,4,7]})
df['cum_sum']=df.apply(fun, axis=1)
datetime speed timedelta cum_sum
0 1-1-2019 19:30:00 0.5 0 0
1 1-1-2019 19:30:00 0.7 2 2
2 1-1-2019 19:30:00 0.1 2 4
3 1-1-2019 19:30:00 5.0 2 0
4 1-1-2019 19:30:00 25.0 2 0
5 1-1-2019 19:30:00 0.1 4 4
6 1-1-2019 19:30:00 0.1 7 11
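The global can be avoided by carrying the running value in a plain loop; a sketch that reproduces the same `cum_sum` column on the same data:

```python
import pandas as pd

df = pd.DataFrame({"datetime": ["1-1-2019 19:30:00"] * 7,
                   "speed": [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
                   "timedelta": [0, 2, 2, 2, 2, 4, 7]})

# Single pass, no global state: reset whenever speed > 2.0,
# otherwise keep adding timedelta.
out = []
running = 0
for speed, td in zip(df["speed"], df["timedelta"]):
    running = 0 if speed > 2.0 else running + td
    out.append(running)
df["cum_sum"] = out
print(df)
```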