Pandas Groupby Cumulative Sum

Pandas function to group by cumulative sum and return another column when a certain amount is reached

cumsum

You can compute the cumulative sum within each group with:

df['amount_cumsum'] = df.groupby(['ID', 'item'])['amount'].cumsum()

Output (stored in a separate column, amount_cumsum, for clarity):

   ID  item  amount  level  amount_cumsum
0   1     1      10      5             10
1   1     1      10     10             20
2   2     4      15      5             15
3   2     9      30      8             30
4   2     4      10     10             25
5   3     4      10      4             10
6   3     4      10      6             20
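
For reference, the step above can be reproduced end to end with a minimal sketch. The input frame here is an assumption: the ID, item, amount and level columns are rebuilt from the output table shown above.

```python
import pandas as pd

# Input columns reconstructed from the output table above (an assumption,
# since the original question's DataFrame is not shown here).
df = pd.DataFrame({
    "ID":     [1, 1, 2, 2, 2, 3, 3],
    "item":   [1, 1, 4, 9, 4, 4, 4],
    "amount": [10, 10, 15, 30, 10, 10, 10],
    "level":  [5, 10, 5, 8, 10, 4, 6],
})

# Running total of `amount`, restarted for every (ID, item) pair.
# Rows keep their original order; only the accumulation is per group.
df["amount_cumsum"] = df.groupby(["ID", "item"])["amount"].cumsum()
print(df)
```

Note that row 4 (ID 2, item 4) continues the total started at row 2, even though row 3 belongs to a different group in between.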

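dictionary

Next, keep only the rows where the cumulative sum has reached at least 20, and collect the level values for each item into a dictionary: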

(df[df['amount_cumsum'].ge(20)]
 .groupby(['item'])['level'].agg(list)
 .to_dict()
 )

Output:

{1: [10], 4: [10, 6], 9: [8]}
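
Both steps can also be wrapped in one function so the threshold is not hard-coded. This is only a sketch: levels_at_threshold and its threshold parameter are hypothetical names, not part of pandas, and the DataFrame is again rebuilt from the table above.

```python
import pandas as pd

def levels_at_threshold(df, threshold=20):
    """Hypothetical helper: for each item, list the levels of the rows
    whose running amount (per ID and item) is at least `threshold`."""
    running = df.groupby(["ID", "item"])["amount"].cumsum()
    reached = df[running.ge(threshold)]
    return reached.groupby("item")["level"].agg(list).to_dict()

# Sample data reconstructed from the output table above (an assumption).
df = pd.DataFrame({
    "ID":     [1, 1, 2, 2, 2, 3, 3],
    "item":   [1, 1, 4, 9, 4, 4, 4],
    "amount": [10, 10, 15, 30, 10, 10, 10],
    "level":  [5, 10, 5, 8, 10, 4, 6],
})

result = levels_at_threshold(df, threshold=20)
print(result)
```

Unlike the snippet above, this version does not add an amount_cumsum column to df; the running total only lives inside the function.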

Pandas: Cumulative sum within group with two conditions

You can use .where() with the conditions x < 1 and x >= 1 to temporarily set the values of value_1 to 0 wherever the condition fails, and then take a groupby cumsum.

The second condition (accumulating within each group of y) is handled by .groupby(), while the first condition (on x) is handled by .where(), as detailed below:

.where() keeps the column values where the condition is True and replaces them (with 0 in this case) where the condition is False. So for the first condition, x < 1, value_1 keeps its values in the matching rows, and these feed into the subsequent cumsum step. In rows where x < 1 is False, value_1 is masked to 0. Adding 0 leaves a running total unchanged, so this has the same effect as excluding those values from the accumulation into column cumsum_1.
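
To make the masking step concrete, here is a small sketch on three values only (the rows with y == 1 from the result table further down), without any grouping. The variable names masked and running are just for illustration.

```python
import pandas as pd

value_1 = pd.Series([12, 10, 7])
x = pd.Series([0.10, 1.20, 0.25])

# Keep value_1 where x < 1; everywhere else substitute 0.
masked = value_1.where(x < 1, 0)
print(masked.tolist())   # [12, 0, 7]

# The 0 contributes nothing, so the middle row just repeats the total.
running = masked.cumsum()
print(running.tolist())  # [12, 12, 19]
```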

The second line of code accumulates value_1 into column cumsum_2 under the opposite condition, x >= 1. In effect, the two lines split value_1 between cumsum_1 and cumsum_2 according to x < 1 and x >= 1, respectively.

(Thanks to @tdy for the suggestion to simplify the code.)

df['cumsum_1'] = df['value_1'].where(df['x'] < 1, 0).groupby(df['y']).cumsum()
df['cumsum_2'] = df['value_1'].where(df['x'] >= 1, 0).groupby(df['y']).cumsum()

Result:

print(df)

      x  y  value_1  cumsum_1  cumsum_2
0  0.10  1       12        12         0
1  1.20  1       10        12        10
2  0.25  1        7        19        10
3  1.00  2        3         0         3
4  0.72  2        5         5         3
5  1.50  2       10         5        13
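
A self-contained version of the above, as a sketch: the x, y and value_1 columns are taken from the result table, and the final check is an addition that confirms the two columns together make up the plain per-group running total.

```python
import pandas as pd

# Input columns taken from the result table above.
df = pd.DataFrame({
    "x":       [0.10, 1.20, 0.25, 1.00, 0.72, 1.50],
    "y":       [1, 1, 1, 2, 2, 2],
    "value_1": [12, 10, 7, 3, 5, 10],
})

# Mask value_1 to 0 where the condition fails, then accumulate per y group.
df["cumsum_1"] = df["value_1"].where(df["x"] < 1, 0).groupby(df["y"]).cumsum()
df["cumsum_2"] = df["value_1"].where(df["x"] >= 1, 0).groupby(df["y"]).cumsum()

# Every row goes to exactly one of the two conditions (x has no NaN here),
# so the two columns should add up to the unconditional grouped cumsum.
total = df.groupby("y")["value_1"].cumsum()
assert (df["cumsum_1"] + df["cumsum_2"]).eq(total).all()

print(df)
```

If x could contain NaN, neither x < 1 nor x >= 1 would be True for those rows, and their values would be dropped from both columns.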

How to calculate cumulative sum (reversed) of a Python DataFrame within given groups?

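You can do this with a Series groupby: reverse the Chi column with .loc[::-1], group it by Basin, and take the cumsum. On assignment the result is aligned back to the original rows by index.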

df['new'] = df.loc[::-1, 'Chi'].groupby(df['Basin']).cumsum()
df
Out[858]: 
   Basin (n=17 columns)  Chi  new
0   13.0            ...    4   14
1   13.0            ...    8   10
2   13.0            ...    2    2
3   21.0            ...    4   10
4   21.0            ...    6    6
5   38.0            ...    1   14
6   38.0            ...    7   13
7   38.0            ...    2    6
8   38.0            ...    4    4
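
A runnable sketch of the same idea, assuming a frame with only the Basin and Chi columns (the 17 elided columns are left out). The cross-check at the end is an alternative formulation added here, not part of the original answer: the reversed cumulative sum equals the group total minus the forward cumulative sum plus the row's own value.

```python
import pandas as pd

# Only the two columns that matter; values taken from the output above.
df = pd.DataFrame({
    "Basin": [13.0, 13.0, 13.0, 21.0, 21.0, 38.0, 38.0, 38.0, 38.0],
    "Chi":   [4, 8, 2, 4, 6, 1, 7, 2, 4],
})

# Walk the rows bottom-up, accumulate within each Basin, and let index
# alignment put each total back on its original row.
df["new"] = df.loc[::-1, "Chi"].groupby(df["Basin"]).cumsum()

# Cross-check without reversing: total - forward cumsum + own value.
grouped = df.groupby("Basin")["Chi"]
alt = grouped.transform("sum") - grouped.cumsum() + df["Chi"]
assert df["new"].eq(alt).all()

print(df)
```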