How to "Reset" Running Sum After It Reaches a Threshold

How to reset a cumulative sum when it reaches a threshold

Because of the nature of this problem, you need to use a recursive CTE. This looks something like:

with t as (
select t.*, row_number() over (order by pk) as seqnum
from yourtable t
),
cte as (
select seqnum, pk, amount, amount as running_amount
from t
where seqnum = 1
union all
select t.seqnum, t.pk, t.amount,
(case when cte.running_amount + t.amount > 300 then t.amount
else cte.running_amount + t.amount
end) as running_amount
from cte join
t
on t.seqnum = cte.seqnum + 1
)
select *
from cte;

The exact syntax for recursive CTEs varies, depending on the database, but they are part of standard SQL.
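
The recurrence that the recursive CTE evaluates may be easier to follow as a plain loop. The following is a minimal Python sketch, not part of the original answer: the function name `reset_running_sum` and the sample amounts are made up for illustration, and the threshold of 300 simply mirrors the query.

```python
def reset_running_sum(amounts, threshold=300):
    """Running sum that restarts from the current amount whenever adding
    it would push the total above `threshold` (the same rule as the CASE
    expression in the query)."""
    result = []
    running = 0
    for amount in amounts:
        if running + amount > threshold:
            running = amount      # reset: the current row starts a new sum
        else:
            running += amount     # keep accumulating
        result.append(running)
    return result


# Hypothetical sample amounts, already ordered by pk
print(reset_running_sum([100, 150, 100, 200, 50, 100]))
# [100, 250, 100, 300, 50, 150]
```

Each value depends on the previous, already reset, total, which is why the SQL version needs recursion rather than a single window function.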

Reset cumulative sum column after threshold with groups

A plain SUM() OVER() cannot cap a cumulative sum, because each reset depends on the previous, already reset, total. One way to get this result is a recursive CTE:

WITH cte_r AS (
SELECT t.*, ROW_NUMBER() OVER(PARTITION BY GroupNr ORDER BY (SELECT 1)) AS rn
FROM Table1 t
), cte AS (
SELECT GroupNr, Name, [Sum], [CumSum],
CAST([Sum] AS INT) AS ResetCumSum,
rn
FROM cte_r
WHERE rn = 1
UNION ALL
SELECT cte_r.GroupNr, cte_r.Name, cte_r.[Sum], cte_r.[CumSum],
CAST(CASE WHEN cte.ResetCumSum >= 330 THEN 0 ELSE cte.ResetCumSum END + cte_r.[Sum] AS INT)
AS ResetCumSum,
cte_r.rn
FROM cte
JOIN cte_r
ON cte.rn = cte_r.rn-1
AND cte.GroupNr = cte_r.GroupNr
)
SELECT GroupNr, Name, [Sum], [CumSum], ResetCumSum
FROM cte
ORDER BY GroupNr, rn;


Warning: A table is by design an unordered set, so a stable result requires an ordering column (such as a unique id or a timestamp). Here ROW_NUMBER() OVER(PARTITION BY GroupNr ORDER BY (SELECT 1)) AS rn was used to emulate insertion order, but it is not stable.
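
To make the per-group rule and the ordering requirement concrete, here is a small Python sketch under stated assumptions: each row is a `(group_nr, order_key, value)` tuple, where `order_key` stands in for the unique id or timestamp mentioned in the warning. The function name and the sample rows are hypothetical; the threshold of 330 mirrors the query.

```python
from itertools import groupby
from operator import itemgetter


def reset_cumsum_by_group(rows, threshold=330):
    """rows: iterable of (group_nr, order_key, value) tuples.
    Returns (group_nr, order_key, value, reset_cum_sum) tuples."""
    out = []
    # Sort by group, then by the explicit order key, so the result is stable.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for group_nr, members in groupby(ordered, key=itemgetter(0)):
        running = 0  # every group starts from zero
        for _, order_key, value in members:
            # If the previous total already reached the threshold, start over.
            if running >= threshold:
                running = 0
            running += value
            out.append((group_nr, order_key, value, running))
    return out


# Hypothetical rows, deliberately shuffled to show that the order key decides
rows = [(1, 2, 200), (1, 1, 150), (1, 3, 100), (2, 1, 330), (2, 2, 10), (1, 4, 50)]
for row in reset_cumsum_by_group(rows):
    print(row)
```

Note that this variant resets on the row after the total reaches the threshold, so a total may exceed 330 before it restarts, as in the SQL above.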

Related:

Conditional SUM and the same using MATCH_RECOGNIZE - in my opinion the cleanest way



Extra:

Quirky UPDATE: Running Total until specific condition is true

Disclaimer: "DO NOT USE IT IN PRODUCTION!!!"

-- the source table must first be extended with id and Resetcumsum columns
CREATE CLUSTERED INDEX IX_ROW_NUM ON Table1(GroupNr, id);

DECLARE @running_total NUMERIC(14,2) = 0
,@prev_running_total NUMERIC(14,2) = 0
,@prev_GroupNr INT = 0;

UPDATE Table1
SET
@prev_running_total = @running_total
,@running_total = Resetcumsum = IIF(@prev_GroupNr != GroupNr
OR @running_total >= 330, 0, @running_total)
+ [Sum]
,@prev_GroupNr = GroupNr
FROM Table1 WITH(INDEX(IX_ROW_NUM))
OPTION (MAXDOP 1);

SELECT *
FROM Table1
ORDER BY id;


Reset rolling sum to 0 after reaching the threshold

Here is the way I managed to do it:

SELECT *,
SUM(case when month_disc=1 OR month_ticket=0 then 0 else value end) OVER (PARTITION BY account, flg_sum, band_sum ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM (
SELECT *,
FLOOR(SUM(case when month_disc=1 OR month_ticket=0 then 0 else value end) OVER (PARTITION BY account, flg_sum ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)/50.000001) as band_sum ---- create bands for running total
FROM (
SELECT *,
SUM(tag_flg) OVER (PARTITION BY account ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS flg_sum
FROM (
SELECT *,
CASE WHEN (month_disc=1 OR month_ticket=0) THEN 1 ELSE 0 END AS tag_flg ---- flag to count when the value is reset due to one of the conditions
FROM source_table) x ) y) z
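
The banding step can be illustrated outside SQL. The sketch below is a simplified Python rendering, assuming non-negative values in date order for a single account, and it leaves out the month_disc / month_ticket flag handling. The function name and the sample values are made up for illustration.

```python
import math
from itertools import accumulate


def banded_running_sum(values, band_width=50.000001):
    """values: a list of non-negative numbers in date order.
    Cuts the overall cumulative sum into bands of `band_width`, then
    restarts the running sum at the first row of each band.
    Returns (band, running_sum) pairs."""
    result = []
    prev_band = None
    running = 0
    for value, total in zip(values, accumulate(values)):
        band = math.floor(total / band_width)
        if band != prev_band:
            running = 0          # first row of a new band
            prev_band = band
        running += value
        result.append((band, running))
    return result


# Hypothetical values
print(banded_running_sum([20, 20, 20, 20, 20]))
# [(0, 20), (0, 40), (1, 20), (1, 40), (1, 60)]
```

In this sample the second band grows to 60, because the bands are cut on the overall cumulative total rather than on the restarted one. A strict row-by-row reset can give different numbers, so it is worth checking which behaviour is needed.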

sum() until threshold value reached and summarize it as a single record and reset and continue the aggregation

Consider the approach below:

with recursive temp as (
select *, row_number() over(partition by id order by from_range) pos
from your_table
), result as (
select *, total_amount as total, true as new_group
from temp where pos = 1
union all
select t.*,
if(total + t.total_amount > 10000000, t.total_amount, total + t.total_amount),
if(total + t.total_amount > 10000000, true, false)
from temp t join result r
on t.pos = r.pos + 1 and t.id = r.id
)
select id,
min(from_range) from_range,
max(to_range) to_range,
max(total) as total_amount
from (
select *, countif(new_group) over(partition by id order by pos) grp
from result
)
group by id, grp
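
The same grouping logic can be sketched in Python for a single id. This is only an illustration under assumptions: rows are `(from_range, to_range, total_amount)` tuples already sorted by from_range, amounts are non-negative, and the limit of 10,000,000 mirrors the query. The function name and the sample rows are hypothetical.

```python
def summarize_ranges(rows, limit=10_000_000):
    """rows: list of (from_range, to_range, total_amount) for one id,
    sorted by from_range. Returns one summary tuple per group."""
    groups = []
    for from_range, to_range, amount in rows:
        if groups and groups[-1][2] + amount <= limit:
            # Still within the limit: extend the current group.
            start, _, total = groups[-1]
            groups[-1] = (start, to_range, total + amount)
        else:
            # First row, or adding it would exceed the limit: new group.
            groups.append((from_range, to_range, amount))
    return groups


# Hypothetical rows for one id
rows = [
    (1, 10, 4_000_000),
    (11, 20, 5_000_000),
    (21, 30, 3_000_000),
    (31, 40, 7_000_000),
    (41, 50, 8_000_000),
]
print(summarize_ranges(rows))
# [(1, 20, 9000000), (21, 40, 10000000), (41, 50, 8000000)]
```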


Resetting Cumulative Sum once a value is reached and set a flag to 1

"Ordinary" cumsum() is useless here, because this function "doesn't know"
where to restart the summation.

You can do it with the following custom function:

def myCumSum(x, thr):
    if myCumSum.prev >= thr:
        myCumSum.prev = 0
    myCumSum.prev += x
    return myCumSum.prev

This function is "with memory" (from the previous call) - prev, so there
is a way to "know" where to restart.

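To apply it to a whole array, define a vectorized version of this function (np.vectorize is a convenience wrapper around a Python-level loop, so it does not make the computation faster):

import numpy as np

# otypes is given explicitly so that np.vectorize makes no extra trial call
myCumSumV = np.vectorize(myCumSum, otypes=[int], excluded=['thr'])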

Then execute:

threshold = 40
myCumSum.prev = 0 # Set the "previous" value
# Replace "a" column with your cumulative sum
df.a = myCumSumV(df.a.values, threshold)
df['flag'] = df.a.ge(threshold).astype(int) # Compute "flag" column

The result is:

     a  b  flag
0    5  1     0
1   11  1     0
2   41  1     1
3  170  0     1
4    5  1     0
5   15  1     0
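
As an alternative that avoids state stored on a function attribute, the same rule can be written with `itertools.accumulate` from the standard library. This is a sketch rather than the answer's own method: the name `reset_cumsum` is made up, and the input values are hypothetical, chosen so that the sums match the result table above.

```python
from itertools import accumulate


def reset_cumsum(values, threshold):
    """Cumulative sum that restarts on the element after the running
    total has reached `threshold`."""
    return list(
        accumulate(values, lambda prev, x: x if prev >= threshold else prev + x)
    )


values = [5, 6, 30, 170, 5, 10]   # hypothetical input
threshold = 40
sums = reset_cumsum(values, threshold)
flags = [int(s >= threshold) for s in sums]
print(sums)   # [5, 11, 41, 170, 5, 15]
print(flags)  # [0, 0, 1, 1, 0, 0]
```

Because the previous total is passed in explicitly, nothing has to be reset by hand between calls.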

