cumsum with a condition to restart in R
You may also use cumsum to create the groups: a new group starts on the row after each port == 0.
library(dplyr)

df <- df %>%
  group_by(group = cumsum(dplyr::lag(port == 0, default = 0))) %>%
  mutate(cumsum_G = cumsum(G)) %>%
  ungroup()

df
#   inv   ass    port     G group cumsum_G
#   <chr> <chr> <int> <int> <dbl>    <int>
#1  i     x         2     1     0        1
#2  i     x         2     0     0        1
#3  i     x         0     1     0        2
#4  i     x         3     0     1        0
#5  i     x         3     1     1        1
You may remove the group column from the output using %>% select(-group).
data
df <- structure(list(inv = c("i", "i", "i", "i", "i"), ass = c("x",
"x", "x", "x", "x"), port = c(2L, 2L, 0L, 3L, 3L), G = c(1L,
0L, 1L, 0L, 1L)), class = "data.frame", row.names = c(NA, -5L))
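For readers following the pandas answers further down, the same grouping idea might be sketched in Python as below. This is only an illustration under the assumption that the data is the five-row frame shown above; the names `starts_new_group` and `group` are hypothetical helpers, not part of the original answer.

```python
import pandas as pd

# Same five rows as the R data above.
df = pd.DataFrame({
    "inv": ["i"] * 5,
    "ass": ["x"] * 5,
    "port": [2, 2, 0, 3, 3],
    "G": [1, 0, 1, 0, 1],
})

# True on the row *after* each port == 0, so the zero row itself
# still belongs to the group it closes (mirrors dplyr::lag above).
starts_new_group = df["port"].eq(0).shift(fill_value=False)
group = starts_new_group.cumsum()

# Cumulative sum of G that restarts with every new group.
df["cumsum_G"] = df.groupby(group)["G"].cumsum()
print(df)
```

Passing the group labels directly to `groupby` avoids adding a helper column that would have to be dropped afterwards.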
Pandas Cumsum conditional reset
import pandas as pd

df = pd.DataFrame({'Size': [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]})

ls = []
cumsum = 0
for _, row in df.iterrows():
    if cumsum + row.Size <= 16:
        # Still within the limit of 16: keep accumulating.
        cumsum += row.Size
    else:
        # Adding this row would exceed 16: restart from this row's value.
        cumsum = row.Size
    ls.append(cumsum)

df['cumsum'] = ls
Result:
   Size  cumsum
0     8       8
1     8      16
2     8       8
3     8      16
4     7       7
5     6      13
6     7       7
7     6      13
8     5       5
9     2       7
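The reset depends on the running total itself, so it cannot be expressed as a plain groupby. One possible way to write the same loop more compactly is `itertools.accumulate` from the standard library. This is a sketch, not the original answer's method; `LIMIT` and `capped_step` are hypothetical names introduced here.

```python
from itertools import accumulate

sizes = [8, 8, 8, 8, 7, 6, 7, 6, 5, 2]
LIMIT = 16  # assumed threshold, same value as in the loop above


def capped_step(running, size):
    """Add size to the running total, or restart from size if LIMIT would be exceeded."""
    return running + size if running + size <= LIMIT else size


# accumulate() passes the first element through unchanged, then applies
# capped_step to (previous result, next element) for the rest.
capped = list(accumulate(sizes, capped_step))
print(capped)
```

The resulting list can be assigned to a column in the same way as `ls` in the loop version, for example `df['cumsum'] = capped`.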
Conditional cumsum and reset to 0
Create custom groups:
grps = df['comment'].str.contains(r'^manual input').cumsum()
df['cumulation'] = df.groupby(grps)['pre_cont_diff'].cumsum()
Output:
>>> df
           comment  count  pre_cont_diff  cumulation
0           auto 1     10            0.0         0.0
1           auto 2     30           20.0        20.0
2           auto 3     70           40.0        60.0
3           auto 4    120           50.0       110.0
4           auto 5    120            0.0       110.0
5           auto 6    130           10.0       120.0
6           auto 7    150           20.0       140.0
7   manual input 1    150            0.0         0.0
8           auto 8    200           50.0        50.0
9           auto 9    230           30.0        80.0
10  manual input 2    230            0.0         0.0
Details:
>>> pd.concat([df['comment'], grps], axis=1)
           comment  comment
0           auto 1        0
1           auto 2        0
2           auto 3        0
3           auto 4        0
4           auto 5        0
5           auto 6        0
6           auto 7        0
7   manual input 1        1
8           auto 8        1
9           auto 9        1
10  manual input 2        2
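To try this end to end, a self-contained version might look like the following. The frame is rebuilt by hand from the output printed above, which is an assumption, since the original question's construction code is not shown.

```python
import pandas as pd

# Values copied from the printed output above.
df = pd.DataFrame({
    "comment": ["auto 1", "auto 2", "auto 3", "auto 4", "auto 5", "auto 6",
                "auto 7", "manual input 1", "auto 8", "auto 9", "manual input 2"],
    "count": [10, 30, 70, 120, 120, 130, 150, 150, 200, 230, 230],
    "pre_cont_diff": [0.0, 20.0, 40.0, 50.0, 0.0, 10.0, 20.0, 0.0, 50.0, 30.0, 0.0],
})

# Each "manual input" row bumps the counter, so it becomes the first
# row of a new group and the running sum restarts there.
grps = df["comment"].str.contains(r"^manual input").cumsum()
df["cumulation"] = df.groupby(grps)["pre_cont_diff"].cumsum()
print(df)
```

The sum restarts at 0 on the "manual input" rows only because `pre_cont_diff` is 0.0 there. With a non-zero value, the group would start from that value instead.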
reset cumulative sum based on another column
Here is one option: first create a grouping column (type2 below) that increments whenever type changes from the previous row, then apply cumsum() within each group.
import pandas as pd
import numpy as np

df = pd.DataFrame([["y", 10],
                   ["y", 20],
                   ["y", 5],
                   ["n", 30],
                   ["n", 20],
                   ["n", 5],
                   ["y", 10],
                   ["y", 40],
                   ["y", 15]], columns=["type", "sale"])

# type2 increases by 1 each time type differs from the previous row.
df["type2"] = np.cumsum(df["type"] != df["type"].shift(1))
# Running total of sale within each run of identical type values.
df["cum_sale"] = df[["sale", "type2"]].groupby("type2").cumsum()
df
Output:
  type  sale  type2  cum_sale
0    y    10      1        10
1    y    20      1        30
2    y     5      1        35
3    n    30      2        30
4    n    20      2        50
5    n     5      2        55
6    y    10      3        10
7    y    40      3        50
8    y    15      3        65
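If the helper column is not wanted in the result, the same run-based reset could be wrapped in a small function. This is a sketch using only pandas; `cumsum_by_run` is a hypothetical name, not an existing pandas API.

```python
import pandas as pd


def cumsum_by_run(frame, key, value):
    """Cumulative sum of `value` that restarts whenever `key` changes from the previous row."""
    # ne() against the shifted column is True on the first row of every run.
    run_id = frame[key].ne(frame[key].shift()).cumsum()
    return frame.groupby(run_id)[value].cumsum()


df = pd.DataFrame({
    "type": list("yyynnnyyy"),
    "sale": [10, 20, 5, 30, 20, 5, 10, 40, 15],
})
df["cum_sale"] = cumsum_by_run(df, "type", "sale")
print(df)
```

Grouping by the run id rather than by `type` itself is what keeps the two separate blocks of "y" rows from being summed together.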
Cumulative sum that resets when the condition is no longer met
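Note: this uses a global variable.
import pandas as pd

c = 0  # running total shared across calls to fun()

def fun(x):
    global c
    if x['speed'] > 2.0:
        # Condition no longer met: reset the running total.
        c = 0
    else:
        c = x['timedelta'] + c
    return c

df = pd.DataFrame({'datetime': ['1-1-2019 19:30:00'] * 7,
                   'speed': [0.5, .7, 0.1, 5.0, 25.0, 0.1, 0.1],
                   'timedelta': [0, 2, 2, 2, 2, 4, 7]})
df['cum_sum'] = df.apply(fun, axis=1)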
            datetime  speed  timedelta  cum_sum
0  1-1-2019 19:30:00    0.5          0        0
1  1-1-2019 19:30:00    0.7          2        2
2  1-1-2019 19:30:00    0.1          2        4
3  1-1-2019 19:30:00    5.0          2        0
4  1-1-2019 19:30:00   25.0          2        0
5  1-1-2019 19:30:00    0.1          4        4
6  1-1-2019 19:30:00    0.1          7       11
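The global variable can probably be avoided by combining a reset mask with a grouped cumsum. The sketch below assumes the same 2.0 speed threshold and the same sample data; `reset` and `groups` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": ["1-1-2019 19:30:00"] * 7,
    "speed": [0.5, 0.7, 0.1, 5.0, 25.0, 0.1, 0.1],
    "timedelta": [0, 2, 2, 2, 2, 4, 7],
})

# Rows where the condition is no longer met.
reset = df["speed"] > 2.0

# Every reset row opens a new group, so the sum restarts there.
groups = reset.cumsum()

# Reset rows contribute 0; all other rows contribute their timedelta.
df["cum_sum"] = df["timedelta"].where(~reset, 0).groupby(groups).cumsum()
print(df)
```

Unlike the `apply` version, this keeps no state between calls, so running it twice gives the same result.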
Cumsum Reset based on a condition in Pandas
Use groupby.cumsum:
df['Cumulative'] = df.groupby('TransactionId')['Delta'].cumsum()
print(df)
   TransactionId  Delta  Cumulative
0             14      2           2
1             14      3           5
2             14      1           6
3             14      2           8
4             15      4           4
5             15      2           6
6             15      3           9
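A runnable version of this answer might look as follows. The frame is rebuilt from the printed output, which is an assumption because the question's input data is not shown.

```python
import pandas as pd

# Input columns copied from the printed output above.
df = pd.DataFrame({
    "TransactionId": [14, 14, 14, 14, 15, 15, 15],
    "Delta": [2, 3, 1, 2, 4, 2, 3],
})

# The running sum restarts for each distinct TransactionId.
df["Cumulative"] = df.groupby("TransactionId")["Delta"].cumsum()
print(df)
```

Grouping by the key itself pools all rows with the same `TransactionId`, wherever they appear in the frame. That matches the output here because each id forms one contiguous block.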
Conditional cumsum with reset when accumulating and subtracting at once
Here is one way you might do this:
ve$DO <- Reduce(function(x, y) pmax(x + y, 0),
                with(ve, V2 - V3 * (V1 > 0)),
                accumulate = TRUE)
ve
   V1 V2 V3 DO
1  -2  6  0  6
2   4  0  5  1
3   3  0  8  0
4  -5  1  0  1
5  -5  0  0  1
6  -7  2  0  3
7  -8  3  0  6
8  -2  5  0 11
9  -3  7  0 18
10 -5  7  0 25
11 -7  8  0 33
12 -8  2  0 35
13 -9  0  0 35
14 -2  0  7 35
15  1  0  9 26
16  2  0 12 14
17  4  0  0 14
Equivalent using purrr/dplyr:
library(purrr)
library(dplyr)

ve %>%
  mutate(DO = accumulate(V2 - V3 * (V1 > 0), .f = ~pmax(.x + .y, 0)))
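The same floor-at-zero accumulation might be written in Python with `itertools.accumulate`. This is a sketch that assumes the V1, V2 and V3 values printed above; `net` and `floored_step` are hypothetical names.

```python
from itertools import accumulate

# Columns copied from the printed `ve` table above.
V1 = [-2, 4, 3, -5, -5, -7, -8, -2, -3, -5, -7, -8, -9, -2, 1, 2, 4]
V2 = [6, 0, 0, 1, 0, 2, 3, 5, 7, 7, 8, 2, 0, 0, 0, 0, 0]
V3 = [0, 5, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 9, 12, 0]

# Net change per row: add V2, and subtract V3 only when V1 is positive.
net = [v2 - v3 * (v1 > 0) for v1, v2, v3 in zip(V1, V2, V3)]


def floored_step(running, change):
    """Apply the change to the running total, never letting it drop below zero."""
    return max(running + change, 0)


DO = list(accumulate(net, floored_step))
print(DO)
```

As with `Reduce(..., accumulate = TRUE)`, the first element is passed through as is, so a negative first value would not be floored unless it is clamped separately.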