Cumsum reset at NaN
A simple NumPy translation of your MATLAB code is this:
import numpy as np
v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)
a = ~n
c = np.cumsum(a)
d = np.diff(np.concatenate(([0.], c[n])))
v[n] = -d
np.cumsum(v)
Executing this code returns the result array([ 1., 2., 3., 0., 1., 2., 3., 4., 0., 1.]). This solution is only as valid as the original one, but it may help you come up with something better if it isn't sufficient for your purposes.
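The translation above relies on every non-NaN entry being 1. As a minimal sketch of the same idea for arbitrary values (the helper name `cumsum_reset_at_nan` is made up here, not part of NumPy), the NaNs can be zeroed first and the running total accumulated since the previous NaN subtracted at each reset:

```python
import numpy as np

def cumsum_reset_at_nan(values):
    """Cumulative sum that restarts from 0 at every NaN (hypothetical helper)."""
    v = np.asarray(values, dtype=float)
    n = np.isnan(v)
    filled = np.where(n, 0.0, v)   # NaNs contribute nothing to the total
    c = np.cumsum(filled)          # running total that ignores the resets
    out = filled.copy()
    # At each NaN, subtract whatever was accumulated since the previous NaN.
    out[n] = -np.diff(np.concatenate(([0.0], c[n])))
    return np.cumsum(out)

result = cumsum_reset_at_nan([1, 2, np.nan, 3, 4, np.nan, np.nan, 5])
print(result)
```

With non-integer data the subtraction may leave tiny floating-point residues at the reset positions instead of exact zeros, so treat this as a sketch rather than a drop-in replacement.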
Pandas dataframe, cumsum reset on NAN
Use groupby and cumsum:
df['s_cumsum'] = df.s_number.groupby(df.s_number.isna().cumsum()).cumsum()
df
Index s_number s_cumsum
0 0 1.0 1.0
1 1 4.0 5.0
2 2 6.0 11.0
3 3 NaN NaN
4 4 7.0 7.0
5 5 2.0 9.0
6 6 3.0 12.0
Note that if "s_number" is a column of strings, use
df['s_number'] = pd.to_numeric(df['s_number'], errors='coerce')
...first, to get a float column with NaNs.
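As a self-contained illustration of the conversion followed by the grouped cumsum, here is a small sketch. The string data is made up for the example, and "n/a" merely stands in for any entry that cannot be parsed as a number:

```python
import pandas as pd

# Made-up string column; "n/a" stands in for any unparseable entry.
df = pd.DataFrame({"s_number": ["1", "4", "6", "n/a", "7", "2", "3"]})

# Unparseable strings become NaN, giving a float column.
df["s_number"] = pd.to_numeric(df["s_number"], errors="coerce")

# Each NaN bumps the key by one, so every NaN starts a new group.
key = df["s_number"].isna().cumsum()
df["s_cumsum"] = df["s_number"].groupby(key).cumsum()
print(df)
```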
If you want to fill the NaNs,
df['s_cumsum'] = (df.s_number.groupby(df.s_number.isna().cumsum())
.cumsum()
.fillna(0, downcast='infer'))
df
Index s_number s_cumsum
0 0 1.0 1
1 1 4.0 5
2 2 6.0 11
3 3 NaN 0
4 4 7.0 7
5 5 2.0 9
6 6 3.0 12
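To my knowledge, newer pandas releases deprecate the downcast argument of fillna. If that applies to your version, one possible alternative (an assumption on my part, not from the original answer) is to fill with 0 and cast explicitly, which is safe only when the sums are whole numbers:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 4, 6, np.nan, 7, 2, 3], name="s_number")

# Grouped cumsum as above, then fill the NaN rows and cast to int.
# The cast assumes all sums are whole numbers; skip it for fractional data.
s_cumsum = s.groupby(s.isna().cumsum()).cumsum().fillna(0).astype(int)
print(s_cumsum.tolist())
```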
Matlab cumsum reset at NaN?
I can only think of a few-pass solution:
v = [1 1 1 NaN 1 1 1 1 NaN 1];
a = v==v; %% convert the values first to [1 1 1 0 1 1 1 1 0 1] format
n = a==0; %% positions of the NaNs
c = cumsum(a); %% your intermediate result
d = diff([0 c(n)]); %% runs of ones
v(n) = -d; %% replace Nans by -3, -4 [1 1 1 -3 1 1 1 1 -4 1]
cumsum(v) %% the answer [1 2 3 0 1 2 3 4 0 1]
Note: I haven't checked edge cases (NaN in the first or last position, consecutive NaNs, etc.).
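One way to probe those edge cases is to compare the trick against a plain loop. The sketch below does this in Python/NumPy rather than MATLAB, so it only suggests how the MATLAB version would behave; both function names are made up for the example, and the input is restricted to ones and NaNs as in the question:

```python
import numpy as np

def reset_cumsum_loop(v):
    """Reference: plain loop that restarts the total at every NaN."""
    out, total = [], 0.0
    for x in v:
        total = 0.0 if np.isnan(x) else total + x
        out.append(total)
    return np.array(out)

def reset_cumsum_trick(v):
    """Port of the multi-pass approach above; valid for ones and NaNs only."""
    v = np.array(v, dtype=float)
    n = np.isnan(v)
    c = np.cumsum(~n)
    v[n] = -np.diff(np.concatenate(([0.0], c[n])))
    return np.cumsum(v)

nan = np.nan
cases = [
    [nan, 1, 1],            # NaN in first position
    [1, 1, nan],            # NaN in last position
    [1, nan, nan, 1, 1],    # consecutive NaNs
    [nan, 1, 1, nan, nan, 1, nan],
]
all_match = all(
    np.array_equal(reset_cumsum_loop(c), reset_cumsum_trick(c)) for c in cases
)
print(all_match)
```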
How to reset cumulative sum per group when a certain column is 0 in pandas
For the given resetting condition, use groupby.cumsum to create a Reset grouper that tells us when Quantity hits 0 within each Group:
condition = df.Quantity.eq(0)
df['Reset'] = condition.groupby(df.Group).cumsum()
# Group Quantity Value Cumulative_sum Reset
# 0 A 10 200 200 0
# 1 B 5 300 300 0
# 2 A 1 50 250 0
# 3 A 0 100 0 1
# 4 C 5 400 400 0
# 5 A 10 300 300 1
# 6 B 10 200 500 0
# 7 A 15 350 650 1
Mask the Value column whenever the resetting condition is met and use another groupby.cumsum on both Group and Reset:
df['Cumul'] = df.Value.mask(condition, 0).groupby([df.Group, df.Reset]).cumsum()
# Group Quantity Value Cumulative_sum Reset Cumul
# 0 A 10 200 200 0 200
# 1 B 5 300 300 0 300
# 2 A 1 50 250 0 250
# 3 A 0 100 0 1 0
# 4 C 5 400 400 0 400
# 5 A 10 300 300 1 300
# 6 B 10 200 500 0 500
# 7 A 15 350 650 1 650
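Putting the two steps together, the following sketch rebuilds the frame shown in the comments and runs both groupby.cumsum calls. The explicit astype(int) on the condition is my addition, so that the Reset counter is an integer regardless of how a given pandas version treats booleans:

```python
import pandas as pd

df = pd.DataFrame({
    "Group":    ["A", "B", "A", "A", "C", "A", "B", "A"],
    "Quantity": [10, 5, 1, 0, 5, 10, 10, 15],
    "Value":    [200, 300, 50, 100, 400, 300, 200, 350],
})

# True wherever the running sum should restart.
condition = df["Quantity"].eq(0)

# Per-group counter of how many resets have happened so far.
df["Reset"] = condition.astype(int).groupby(df["Group"]).cumsum()

# Zero out the Value on reset rows, then sum within (Group, Reset) blocks.
df["Cumul"] = (
    df["Value"].mask(condition, 0)
    .groupby([df["Group"], df["Reset"]])
    .cumsum()
)
print(df)
```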
Cumsum from DateTime that reset at specific times
Try groupby().cumcount() on the cumsum:
# blocks starting with `14:30:00`
# print to see the blocks
blocks = df.Time.eq('14:30:00').cumsum()
# enumerate the rows within each block with `groupby`
df['count_1430'] = df.groupby(blocks).cumcount()
Output:
Date Time Open High Low Last count_1430
0 28/05/2018 14:30:00 1.16167 1.16252 1.16130 1.16166 0
1 28/05/2018 15:00:00 1.16166 1.16287 1.16159 1.16276 1
2 28/05/2018 15:30:00 1.16277 1.16293 1.16177 1.16212 2
3 28/05/2018 16:00:00 1.16213 1.16318 1.16198 1.16262 3
4 28/05/2018 16:30:00 1.16262 1.16298 1.16258 1.16284 4
5 28/05/2018 17:00:00 1.16285 1.16329 1.16264 1.16265 5
6 28/05/2018 17:30:00 1.16266 1.16300 1.16243 1.16289 6
7 28/05/2018 18:00:00 1.16288 1.16290 1.16228 1.16269 7
8 28/05/2018 18:30:00 1.16269 1.16278 1.16264 1.16274 8
9 28/05/2018 19:00:00 1.16275 1.16277 1.16270 1.16275 9
10 28/05/2018 19:30:00 1.16276 1.16284 1.16270 1.16280 10
11 28/05/2018 20:00:00 1.16279 1.16288 1.16264 1.16278 11
12 28/05/2018 20:30:00 1.16278 1.16289 1.16260 1.16265 12
13 28/05/2018 21:00:00 1.16267 1.16270 1.16251 1.16262 13
14 29/05/2018 14:30:00 1.15793 1.15827 1.15714 1.15786 0
15 29/05/2018 15:00:00 1.15785 1.15900 1.15741 1.15814 1
16 29/05/2018 15:30:00 1.15813 1.15813 1.15601 1.15647 2
17 29/05/2018 16:00:00 1.15647 1.15658 1.15451 1.15539 3
18 29/05/2018 16:30:00 1.15539 1.15601 1.15418 1.15510 4
19 29/05/2018 17:00:00 1.15508 1.15599 1.15463 1.15527 5
20 29/05/2018 17:30:00 1.15528 1.15587 1.15442 1.15465 6
21 29/05/2018 18:00:00 1.15465 1.15469 1.15196 1.15261 7
22 29/05/2018 18:30:00 1.15261 1.15441 1.15261 1.15349 8
23 29/05/2018 19:00:00 1.15348 1.15399 1.15262 1.15399 9
24 29/05/2018 19:30:00 1.15400 1.15412 1.15239 1.15322 10
25 29/05/2018 20:00:00 1.15322 1.15373 1.15262 1.15367 11
26 29/05/2018 20:30:00 1.15367 1.15419 1.15351 1.15367 12
27 29/05/2018 21:00:00 1.15366 1.15438 1.15352 1.15354 13
28 29/05/2018 21:30:00 1.15355 1.15355 1.15354 1.15354 14
29 30/05/2018 14:30:00 1.16235 1.16323 1.16133 1.16161 0
30 30/05/2018 15:00:00 1.16162 1.16193 1.16020 1.16059 1
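The same block key can also drive an actual cumulative sum. In this small sketch the Volume column and its values are hypothetical (the original data has no such column); it is only there to show a sum restarting at each 14:30:00 row:

```python
import pandas as pd

df = pd.DataFrame({
    "Time":   ["14:30:00", "15:00:00", "15:30:00", "14:30:00", "15:00:00"],
    "Volume": [10, 20, 30, 5, 15],   # hypothetical column for illustration
})

# A new block starts at every 14:30:00 row.
blocks = df["Time"].eq("14:30:00").cumsum()

# Row counter within each block, as in the answer above.
df["count_1430"] = df.groupby(blocks).cumcount()

# Running sum that restarts at the top of each block.
df["volume_cumsum"] = df.groupby(blocks)["Volume"].cumsum()
print(df)
```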
Python pandas cumsum with reset everytime there is a 0
You can use:
a = df != 0
df1 = a.cumsum()-a.cumsum().where(~a).ffill().fillna(0).astype(int)
print (df1)
a b
0 0 1
1 1 2
2 0 3
3 1 0
4 2 1
5 0 2
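For a runnable check, the sketch below uses an input frame of ones and zeros that I chose so that it reproduces the output above; it is not necessarily the asker's data. Note that the expression counts consecutive non-zero entries, which equals the cumulative sum only when every non-zero value is 1:

```python
import pandas as pd

# Input chosen to reproduce the output shown above (an assumption).
df = pd.DataFrame({"a": [0, 1, 0, 1, 1, 0],
                   "b": [1, 1, 1, 0, 1, 1]})

a = df != 0                 # True for non-zero entries
running = a.cumsum()        # count of non-zero entries so far, per column

# Value of the running count at the most recent zero, carried forward.
at_last_zero = running.where(~a).ffill().fillna(0).astype(int)

df1 = running - at_last_zero
print(df1)
```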
How to reset cumsum after change in sign of values?
Create a new key to group by, then do a cumsum within each group.
Creating the new key: detect sign changes; each time the sign changes, add one to the key so that the row falls into the next group.
df.groupby(df.data.lt(0).astype(int).diff().ne(0).cumsum()).data.cumsum()
Out[798]:
0 -2
1 -3
2 1
3 -3
4 -4
5 2
6 2
7 5
8 -1
9 -3
Name: data, dtype: int64
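To see how the key is built, here is a small sketch on made-up data (not the asker's). It assumes, as the one-liner above does, that zeros count as non-negative:

```python
import pandas as pd

df = pd.DataFrame({"data": [-2, -1, 1, -3, -1, 2, 3, -1]})  # made-up values

# 1 for negative rows, 0 otherwise; a non-zero diff marks a sign change.
is_negative = df["data"].lt(0).astype(int)
key = is_negative.diff().ne(0).cumsum()   # first row's NaN diff also counts

result = df.groupby(key)["data"].cumsum()
print(key.tolist())
print(result.tolist())
```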
pandas fillna in column with cumsum of previous rows (reset after every nan)
Use GroupBy.cumsum with a helper Series created by flagging the missing values and taking another cumsum:
df['sum'] = df.groupby(df['points'].isna().cumsum())['points'].cumsum()
print (df)
team points sum
0 GB 43.76 43.76
1 TEN 17.30 61.06
2 ARI 0.20 61.26
3 ATL 12.30 73.56
4 HOU 21.10 94.66
5 ARI 1.70 96.36
6 ATL 12.60 108.96
7 SF 15.00 123.96
8 GB 5.70 129.66
9 1 NaN NaN
10 GB 43.76 43.76
11 TEN 17.30 61.06
12 ARI 0.20 61.26
13 ATL 12.30 73.56
14 HOU 21.10 94.66
15 ARI 1.70 96.36
16 ATL 12.60 108.96
17 BUF 7.00 115.96
18 GB 5.70 121.66
19 2 NaN NaN
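If the goal, as the question title suggests, is to write each block's total into the NaN row that follows it, one possible follow-up is to fill from the shifted sum column. This step is my assumption about the intent and is not part of the answer above; the data is made up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"points": [10.0, 2.5, np.nan, 5.0, 7.0, np.nan]})

# Running sum that restarts after every NaN, as above.
key = df["points"].isna().cumsum()
df["sum"] = df.groupby(key)["points"].cumsum()

# The row before each NaN holds the block total; shift it down one row.
df["points_filled"] = df["points"].fillna(df["sum"].shift())
print(df)
```

With two NaNs in a row, the second one would stay NaN under this approach, because the shifted value is itself missing.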