Conditional Cumsum with Reset

cumsum with a condition to restart in R

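You can also use cumsum to create the groups: the group index increases by one on the row after each port == 0, and cumsum(G) then restarts within each group.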

library(dplyr)

df <- df %>%
  group_by(group = cumsum(dplyr::lag(port == 0, default = 0))) %>%
  mutate(cumsum_G = cumsum(G)) %>%
  ungroup()

df

#  inv   ass    port     G group cumsum_G
#  <chr> <chr> <int> <int> <dbl>    <int>
#1 i     x         2     1     0        1
#2 i     x         2     0     0        1
#3 i     x         0     1     0        2
#4 i     x         3     0     1        0
#5 i     x         3     1     1        1

You may remove the group column from output using %>% select(-group).

data

df <- structure(list(inv = c("i", "i", "i", "i", "i"), ass = c("x", 
"x", "x", "x", "x"), port = c(2L, 2L, 0L, 3L, 3L), G = c(1L,
0L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -5L))
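
For readers working in pandas, the same idea can be sketched as follows. This is a rough translation under the assumption that the reset should take effect on the row after port == 0, as in the R answer; the variable name group is only illustrative.

```python
import pandas as pd

# Same sample data as the R example above
df = pd.DataFrame({
    "inv": ["i"] * 5,
    "ass": ["x"] * 5,
    "port": [2, 2, 0, 3, 3],
    "G": [1, 0, 1, 0, 1],
})

# Flag rows where port == 0, shift the flag down one row so the reset
# takes effect on the *next* row, then cumsum the flags into group ids
group = df["port"].eq(0).shift(fill_value=False).cumsum()

# The cumulative sum of G restarts within each group
df["cumsum_G"] = df.groupby(group)["G"].cumsum()
print(df)
```

Passing the group Series directly to groupby avoids adding a helper column that would have to be dropped afterwards.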

Pandas Cumsum conditional reset


import pandas as pd

df = pd.DataFrame({'Size': [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]})

ls = []
cumsum = 0
for _, row in df.iterrows():
    if cumsum + row.Size <= 16:
        # Still within the limit of 16: keep accumulating
        cumsum += row.Size
    else:
        # Adding this row would exceed 16: restart from this row's value
        cumsum = row.Size
    ls.append(cumsum)

df['cumsum'] = ls

Result:

   Size  cumsum
0     8       8
1     8      16
2     8       8
3     8      16
4     7       7
5     6      13
6     7       7
7     6      13
8     5       5
9     2       7
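
Because each reset depends on the running total so far, this kind of threshold reset does not reduce to a plain groupby. One possible tidy-up is to move the loop into a small helper that works on plain Python values instead of iterrows. The function name capped_cumsum and its cap parameter are made up for this sketch.

```python
import pandas as pd

def capped_cumsum(values, cap):
    """Running total that restarts at the current value whenever
    adding it would push the total above `cap`."""
    out = []
    total = 0
    for v in values:
        # Keep accumulating while within the cap, otherwise restart at v
        total = total + v if total + v <= cap else v
        out.append(total)
    return out

df = pd.DataFrame({"Size": [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]})
df["cumsum"] = capped_cumsum(df["Size"].tolist(), cap=16)
print(df)
```

Iterating over a list of values rather than over DataFrame rows skips building a Series per row, which tends to matter on larger frames.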

Conditional cumsum and reset to 0

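Create custom groups that start a new group at every row whose comment begins with "manual input", then take the cumulative sum within each group: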

grps = df['comment'].str.contains(r'^manual input').cumsum()

df['cumulation'] = df.groupby(grps)['pre_cont_diff'].cumsum()

Output:

>>> df
           comment  count  pre_cont_diff  cumulation
0           auto 1     10            0.0         0.0
1           auto 2     30           20.0        20.0
2           auto 3     70           40.0        60.0
3           auto 4    120           50.0       110.0
4           auto 5    120            0.0       110.0
5           auto 6    130           10.0       120.0
6           auto 7    150           20.0       140.0
7   manual input 1    150            0.0         0.0
8           auto 8    200           50.0        50.0
9           auto 9    230           30.0        80.0
10  manual input 2    230            0.0         0.0

Details:

>>> pd.concat([df['comment'], grps], axis=1)
           comment  comment
0           auto 1        0
1           auto 2        0
2           auto 3        0
3           auto 4        0
4           auto 5        0
5           auto 6        0
6           auto 7        0
7   manual input 1        1
8           auto 8        1
9           auto 9        1
10  manual input 2        2
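
A self-contained version of the above, with the frame rebuilt from the printed output, might look like this. In the sample data pre_cont_diff is already 0 on the "manual input" rows, so the total starts from 0 there. The mask step is an extra assumption for data where that may not hold, and is_reset is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "comment": ["auto 1", "auto 2", "auto 3", "auto 4", "auto 5", "auto 6",
                "auto 7", "manual input 1", "auto 8", "auto 9", "manual input 2"],
    "count": [10, 30, 70, 120, 120, 130, 150, 150, 200, 230, 230],
    "pre_cont_diff": [0.0, 20.0, 40.0, 50.0, 0.0, 10.0, 20.0, 0.0, 50.0, 30.0, 0.0],
})

# True on rows that should restart the running total
is_reset = df["comment"].str.startswith("manual input")

# Each reset row opens a new group
grps = is_reset.cumsum()

# Force the reset rows to contribute 0, then accumulate within each group
df["cumulation"] = df["pre_cont_diff"].mask(is_reset, 0).groupby(grps).cumsum()
print(df)
```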

reset cumulative sum based on another column

One option is to first create a run-identifier column (type2 below) that increases by one whenever type changes, and then take cumsum() within each run.

import pandas as pd
import numpy as np

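df = pd.DataFrame([["y", 10],
                   ["y", 20],
                   ["y", 5],
                   ["n", 30],
                   ["n", 20],
                   ["n", 5],
                   ["y", 10],
                   ["y", 40],
                   ["y", 15]], columns=["type", "sale"])

# type2 increases by one each time "type" differs from the previous row
df["type2"] = np.cumsum((df["type"] != df["type"].shift(1)))
# Cumulative sum of "sale" within each run of equal "type" values
df["cum_sale"] = df[["sale", "type2"]].groupby("type2").cumsum()
df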

Output:

  type  sale  type2  cum_sale
0    y    10      1        10
1    y    20      1        30
2    y     5      1        35
3    n    30      2        30
4    n    20      2        50
5    n     5      2        55
6    y    10      3        10
7    y    40      3        50
8    y    15      3        65
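
If the helper column is not needed in the result, the same pattern can be wrapped in a small function that uses only pandas. The name cumsum_by_runs and its parameters are hypothetical, chosen for this sketch.

```python
import pandas as pd

def cumsum_by_runs(frame, key, value):
    """Cumulative sum of `value` that restarts whenever `key` changes
    from one row to the next."""
    # True on the first row of every run of equal key values
    starts = frame[key].ne(frame[key].shift())
    run_id = starts.cumsum()
    return frame.groupby(run_id)[value].cumsum()

df = pd.DataFrame({
    "type": ["y", "y", "y", "n", "n", "n", "y", "y", "y"],
    "sale": [10, 20, 5, 30, 20, 5, 10, 40, 15],
})
df["cum_sale"] = cumsum_by_runs(df, "type", "sale")
print(df)
```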

Cumulative sum that resets when the condition is no longer met

Note: this approach relies on a global variable, so reset c to 0 before reusing fun.

import pandas as pd

c = 0
def fun(x):
    global c
    if x['speed'] > 2.0:
        # Condition no longer met: reset the running total
        c = 0
    else:
        c = x['timedelta'] + c
    return c

df = pd.DataFrame({'datetime': ['1-1-2019 19:30:00'] * 7,
                   'speed': [0.5, .7, 0.1, 5.0, 25.0, 0.1, 0.1],
                   'timedelta': [0, 2, 2, 2, 2, 4, 7]})

df['cum_sum'] = df.apply(fun, axis=1)

Output:

            datetime  speed  timedelta  cum_sum
0  1-1-2019 19:30:00    0.5          0        0
1  1-1-2019 19:30:00    0.7          2        2
2  1-1-2019 19:30:00    0.1          2        4
3  1-1-2019 19:30:00    5.0          2        0
4  1-1-2019 19:30:00   25.0          2        0
5  1-1-2019 19:30:00    0.1          4        4
6  1-1-2019 19:30:00    0.1          7       11
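
The global variable can probably be avoided with the grouping trick used elsewhere on this page. The sketch below assumes the same rule (speed > 2.0 resets the total to 0); reset and run_id are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": ["1-1-2019 19:30:00"] * 7,
    "speed": [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
    "timedelta": [0, 2, 2, 2, 2, 4, 7],
})

# Rows where the condition is no longer met
reset = df["speed"] > 2.0

# Every reset row starts a new group
run_id = reset.cumsum()

# Reset rows contribute 0; other rows accumulate within their group
df["cum_sum"] = df["timedelta"].mask(reset, 0).groupby(run_id).cumsum()
print(df)
```

Since no state is kept between calls, rerunning this on the same frame gives the same result, unlike the apply version if c is not reset first.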

Cumsum Reset based on a condition in Pandas

Use groupby.cumsum:

df['Cumulative'] = df.groupby('TransactionId')['Delta'].cumsum()

print(df)

   TransactionId  Delta  Cumulative
0             14      2           2
1             14      3           5
2             14      1           6
3             14      2           8
4             15      4           4
5             15      2           6
6             15      3           9
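
One caveat worth illustrating: grouping by the id itself pools all rows that share an id, even when they are not adjacent. The data below is hypothetical (it is not from the original question) and compares that behavior with resetting at every change of id.

```python
import pandas as pd

# Hypothetical data: TransactionId 14 appears in two separate runs
df = pd.DataFrame({"TransactionId": [14, 14, 15, 14], "Delta": [2, 3, 4, 1]})

# Grouping by id: the last row continues the earlier total for id 14
by_id = df.groupby("TransactionId")["Delta"].cumsum()

# Grouping by run: the total restarts every time the id changes
run_id = df["TransactionId"].ne(df["TransactionId"].shift()).cumsum()
by_run = df.groupby(run_id)["Delta"].cumsum()

print(by_id.tolist())
print(by_run.tolist())
```

When ids are contiguous, as in the output above, both variants give the same result.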

Conditional cumsum with reset when accumulating and subtracting at once

Here is one way you might do this: add V2, subtract V3 only where V1 > 0, and floor the running total at 0 with pmax:

ve$DO <- Reduce(function(x,y) pmax(x + y, 0), with(ve, V2-V3*(V1 >  0)), accumulate = TRUE)

ve
   V1 V2 V3 DO
1  -2  6  0  6
2   4  0  5  1
3   3  0  8  0
4  -5  1  0  1
5  -5  0  0  1
6  -7  2  0  3
7  -8  3  0  6
8  -2  5  0 11
9  -3  7  0 18
10 -5  7  0 25
11 -7  8  0 33
12 -8  2  0 35
13 -9  0  0 35
14 -2  0  7 35
15  1  0  9 26
16  2  0 12 14
17  4  0  0 14

Equivalent using purrr/dplyr:

library(purrr)
library(dplyr)

ve %>%
  mutate(DO = accumulate(V2 - V3 * (V1 > 0), .f = ~pmax(.x + .y, 0)))
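
A rough Python counterpart of the Reduce/accumulate approach can be built on itertools.accumulate. The frame below is rebuilt from the printed output, and net is an illustrative name. As in the R versions, the first element is taken as-is and the floor at 0 applies from the second element on.

```python
from itertools import accumulate

import pandas as pd

ve = pd.DataFrame({
    "V1": [-2, 4, 3, -5, -5, -7, -8, -2, -3, -5, -7, -8, -9, -2, 1, 2, 4],
    "V2": [6, 0, 0, 1, 0, 2, 3, 5, 7, 7, 8, 2, 0, 0, 0, 0, 0],
    "V3": [0, 5, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 9, 12, 0],
})

# Net change per row: add V2, subtract V3 only where V1 > 0
net = (ve["V2"] - ve["V3"] * (ve["V1"] > 0)).tolist()

# Running total that is never allowed to drop below 0
ve["DO"] = list(accumulate(net, lambda total, x: max(total + x, 0)))
print(ve)
```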

