Cumsum Reset at Certain Values

How to reset cumulative sum per group when a certain column is 0 in pandas

For the given resetting condition, use groupby.cumsum to create a Reset grouper that tells us when Quantity hits 0 within each Group:

condition = df.Quantity.eq(0)
df['Reset'] = condition.groupby(df.Group).cumsum()

#   Group  Quantity  Value  Cumulative_sum  Reset
# 0     A        10    200             200      0
# 1     B         5    300             300      0
# 2     A         1     50             250      0
# 3     A         0    100               0      1
# 4     C         5    400             400      0
# 5     A        10    300             300      1
# 6     B        10    200             500      0
# 7     A        15    350             650      1

Then mask the Value column (set it to 0) wherever the resetting condition is met, and apply another groupby.cumsum keyed on both Group and Reset:

df['Cumul'] = df.Value.mask(condition, 0).groupby([df.Group, df.Reset]).cumsum()

#   Group  Quantity  Value  Cumulative_sum  Reset  Cumul
# 0     A        10    200             200      0    200
# 1     B         5    300             300      0    300
# 2     A         1     50             250      0    250
# 3     A         0    100               0      1      0
# 4     C         5    400             400      0    400
# 5     A        10    300             300      1    300
# 6     B        10    200             500      0    500
# 7     A        15    350             650      1    650
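The two steps above can be wrapped into one self-contained sketch. The helper name cumsum_with_reset is hypothetical, and the frame is rebuilt from the printed output above, so treat the data as an assumption about the original question:

```python
import pandas as pd

def cumsum_with_reset(values, groups, reset_mask):
    """Per-group cumulative sum that restarts at 0 on rows where reset_mask is True.

    Hypothetical helper, not part of pandas.
    """
    # Counts how many resets have occurred so far within each group
    reset_id = reset_mask.astype(int).groupby(groups).cumsum()
    # Zero out the reset rows, then accumulate within (group, reset_id)
    return values.mask(reset_mask, 0).groupby([groups, reset_id]).cumsum()

# Data reconstructed from the printed table (an assumption)
df = pd.DataFrame({
    "Group": list("ABAACABA"),
    "Quantity": [10, 5, 1, 0, 5, 10, 10, 15],
    "Value": [200, 300, 50, 100, 400, 300, 200, 350],
})

df["Cumul"] = cumsum_with_reset(df["Value"], df["Group"], df["Quantity"].eq(0))
print(df)
```

Passing the mask in as an argument keeps the helper independent of the specific Quantity == 0 condition.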

Resetting Cumulative Sum once a value is reached and set a flag to 1

A plain cumsum() is of no use here, because it has no way of knowing
where to restart the summation.

You can do it with the following custom function:

def myCumSum(x, thr):
    if myCumSum.prev >= thr:
        myCumSum.prev = 0
    myCumSum.prev += x
    return myCumSum.prev

This function keeps state between calls in its prev attribute, which is
how it "knows" where to restart.

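To apply it to a whole array, wrap it with np.vectorize. This is a
convenience wrapper around a Python loop, so it does not make execution
faster:

# otypes=[int]: the np.int alias was removed in NumPy 1.24
myCumSumV = np.vectorize(myCumSum, otypes=[int], excluded=['thr'])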

Then execute:

threshold = 40
myCumSum.prev = 0  # Set the "previous" value
# Replace "a" column with your cumulative sum
df.a = myCumSumV(df.a.values, threshold)
df['flag'] = df.a.ge(threshold).astype(int)  # Compute "flag" column

The result is:

     a  b  flag
0    5  1     0
1   11  1     0
2   41  1     1
3  170  0     1
4    5  1     0
5   15  1     0
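As an alternative that avoids state stored on the function object, the same rule can be sketched with itertools.accumulate. The helper name threshold_cumsum and the input column are assumptions: the input values below were reconstructed from the result shown above, not taken from the original question.

```python
from itertools import accumulate

import pandas as pd

def threshold_cumsum(values, thr):
    """Running sum that restarts once the previous total has reached thr.

    Hypothetical helper mirroring myCumSum without a function attribute.
    """
    def step(acc, x):
        # If the previous total already reached the threshold, start over at x
        return x if acc >= thr else acc + x
    return list(accumulate(values, step))

# Input reconstructed from the printed result (an assumption)
df = pd.DataFrame({"a": [5, 6, 30, 170, 5, 10]})
threshold = 40

df["a_cum"] = threshold_cumsum(df["a"].tolist(), threshold)
df["flag"] = df["a_cum"].ge(threshold).astype(int)
print(df)
```

Because the running total lives inside accumulate, there is nothing to reset by hand between runs.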

Pandas Cumsum conditional reset

import pandas as pd

df = pd.DataFrame({'Size': [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]})

ls = []
cumsum = 0
for _, row in df.iterrows():
    if cumsum + row.Size <= 16:
        # Still within the limit: keep accumulating
        cumsum += row.Size
    else:
        # Limit would be exceeded: restart from the current row
        cumsum = row.Size
    ls.append(cumsum)

df['cumsum'] = ls

Result:

    Size    cumsum
0   8       8
1   8       16
2   8       8
3   8       16
4   7       7
5   6       13
6   7       7
7   6       13
8   5       5
9   2       7
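The loop can also be pulled into a small function that works on plain values instead of iterrows, which tends to be easier to reuse and test. This is only a sketch; the name capped_cumsum and its cap parameter are hypothetical.

```python
import pandas as pd

def capped_cumsum(values, cap):
    """Running sum that restarts from the current value whenever adding it
    would push the total above cap. Hypothetical helper."""
    total = 0
    out = []
    for v in values:
        # Keep accumulating while within the cap, otherwise restart at v
        total = total + v if total + v <= cap else v
        out.append(total)
    return out

df = pd.DataFrame({"Size": [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]})
df["cumsum"] = capped_cumsum(df["Size"].tolist(), cap=16)
print(df)
```

The reset depends on the running total itself, so this kind of rule generally cannot be expressed as a single groupby.cumsum; an explicit loop is the straightforward choice.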

cumsum with a condition to restart in R

You can also use cumsum to build the groups: a new group starts on the row after each port == 0.

library(dplyr)

df <- df %>%
  group_by(group = cumsum(dplyr::lag(port == 0, default = 0))) %>%
  mutate(cumsum_G = cumsum(G)) %>%
  ungroup

df

#  inv   ass    port     G group cumsum_G
#  <chr> <chr> <int> <int> <dbl>    <int>
#1 i     x         2     1     0        1
#2 i     x         2     0     0        1
#3 i     x         0     1     0        2
#4 i     x         3     0     1        0
#5 i     x         3     1     1        1

You may remove the group column from output using %>% select(-group).

data

df <- structure(list(inv = c("i", "i", "i", "i", "i"), ass = c("x", 
"x", "x", "x", "x"), port = c(2L, 2L, 0L, 3L, 3L), G = c(1L, 
0L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -5L))
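For readers working in pandas rather than R, the same lag-then-cumsum idea can be sketched as follows. This is a rough translation, not part of the original answer; only the port and G columns of the data above are reproduced.

```python
import pandas as pd

df = pd.DataFrame({"port": [2, 2, 0, 3, 3], "G": [1, 0, 1, 0, 1]})

# Equivalent of cumsum(lag(port == 0, default = 0)):
# the group id increases on the row after each port == 0
group = df["port"].eq(0).shift(fill_value=False).astype(int).cumsum()

# Cumulative sum of G within each group
df["cumsum_G"] = df["G"].groupby(group).cumsum()
print(df)
```

Keeping group as a separate Series means there is no helper column to drop afterwards.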

reset cumulative sum based on another column

Here is one option: first create a helper column, type2, that increments whenever type changes, then take cumsum() within each type2 group.

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [["y", 10], ["y", 20], ["y", 5],
     ["n", 30], ["n", 20], ["n", 5],
     ["y", 10], ["y", 40], ["y", 15]],
    columns=["type", "sale"],
)

# type2 increments every time "type" differs from the previous row
df["type2"] = np.cumsum(df["type"] != df["type"].shift(1))
# Cumulative sum of "sale" within each run of equal "type" values
df["cum_sale"] = df.groupby("type2")["sale"].cumsum()
df

Output:

    type    sale    type2  cum_sale
0   y       10      1      10
1   y       20      1      30
2   y       5       1      35
3   n       30      2      30
4   n       20      2      50
5   n       5       2      55
6   y       10      3      10
7   y       40      3      50
8   y       15      3      65

Cumulative sum that resets when the condition is no longer met

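Note: this relies on a global variable, c, which must be set back to 0
before every run.

import pandas as pd

c = 0  # running total, shared across calls to fun
def fun(x):
    global c
    if x['speed'] > 2.0:
        # Condition no longer met: reset the running total
        c = 0
    else:
        c = x['timedelta'] + c
    return c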

df = pd.DataFrame( {'datetime': ['1-1-2019 19:30:00']*7,
    'speed': [0.5,.7,0.1,5.0,25.0,0.1,0.1], 'timedelta': [0,2,2,2,2,4,7]})

df['cum_sum']=df.apply(fun, axis=1)

            datetime    speed   timedelta   cum_sum
0   1-1-2019 19:30:00   0.5     0           0
1   1-1-2019 19:30:00   0.7     2           2
2   1-1-2019 19:30:00   0.1     2           4
3   1-1-2019 19:30:00   5.0     2           0
4   1-1-2019 19:30:00   25.0    2           0
5   1-1-2019 19:30:00   0.1     4           4
6   1-1-2019 19:30:00   0.1     7           11
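The same result can be obtained without a global variable by building a grouper from the reset condition, in the spirit of the groupby.cumsum answers above. This is a sketch on the same sample data, with the datetime column omitted because it does not affect the calculation.

```python
import pandas as pd

df = pd.DataFrame({
    "speed": [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
    "timedelta": [0, 2, 2, 2, 2, 4, 7],
})

# Rows where the condition is no longer met (the running total resets here)
reset = df["speed"].gt(2.0)

# Each reset row starts a new block; rows after it belong to that block
block = reset.astype(int).cumsum()

# Reset rows contribute 0, everything else accumulates within its block
df["cum_sum"] = df["timedelta"].mask(reset, 0).groupby(block).cumsum()
print(df)
```

Since no state is kept between calls, the calculation can be rerun safely on the same or a different frame.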