Calculating Cumulative Sum in PostgreSQL

Basically, you need a window function, which is a standard feature nowadays. In addition to genuine window functions, you can use any aggregate function as a window function in Postgres by appending an OVER clause.

The special difficulty here is to get partitions and sort order right:

SELECT ea_month, id, amount, ea_year, circle_id
, sum(amount) OVER (PARTITION BY circle_id
ORDER BY ea_year, ea_month) AS cum_amt
FROM tbl
ORDER BY circle_id, ea_year, ea_month;

And no GROUP BY.

The sum for each row is calculated from the first row in the partition to the current row - or quoting the manual to be precise:

The default framing option is RANGE UNBOUNDED PRECEDING, which is
the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. With
ORDER BY, this sets the frame to be all rows from the partition
start up through the current row's last ORDER BY peer.

... which is the cumulative or running sum you are after. Bold emphasis mine.

Rows with the same (circle_id, ea_year, ea_month) are "peers" in this query. All peers show the same running sum, which includes the amounts of all of them. But I assume your table is UNIQUE on (circle_id, ea_year, ea_month), in which case the sort order is deterministic and no row has peers.
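To make the effect of peers concrete, here is a minimal sketch with hypothetical sample data (the VALUES rows and the integer ea_month are assumptions for illustration, not the asker's table). It compares the default RANGE frame with an explicit ROWS frame that uses id as a tiebreaker:

```sql
-- Hypothetical sample data: ids 2 and 3 share (circle_id, ea_year, ea_month).
WITH tbl (id, circle_id, ea_year, ea_month, amount) AS (
   VALUES (1, 1, 2015, 1, 10)
        , (2, 1, 2015, 2, 20)
        , (3, 1, 2015, 2, 30)   -- peer of id 2
        , (4, 1, 2015, 3, 40)
   )
SELECT id, amount
       -- default frame (RANGE): peers are summed together
     , sum(amount) OVER (PARTITION BY circle_id
                         ORDER BY ea_year, ea_month) AS cum_range
       -- explicit ROWS frame with a tiebreaker: strictly row by row
     , sum(amount) OVER (PARTITION BY circle_id
                         ORDER BY ea_year, ea_month, id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_rows
FROM   tbl
ORDER  BY circle_id, ea_year, ea_month, id;
```

With this data, cum_range is 10, 60, 60, 100, while cum_rows is 10, 30, 60, 100. The two columns differ only for the peer rows.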

Postgres 11 added tools to include / exclude peers with the new frame_exclusion options. See:

  • Aggregating all values not in the same group
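As a rough illustration of frame exclusion, the following sketch sums all values except those of the current row's peer group. It assumes Postgres 11 or later; the CTE t and its columns grp and val are made up for the example:

```sql
-- Hypothetical data: two rows in group 'a', one each in 'b' and 'c'.
WITH t (grp, val) AS (
   VALUES ('a', 1), ('a', 2), ('b', 10), ('c', 100)
   )
SELECT grp, val
       -- frame spans the whole result, minus the current row and its ORDER BY peers
     , sum(val) OVER (ORDER BY grp
                      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
                      EXCLUDE GROUP) AS sum_others
FROM   t
ORDER  BY grp, val;
```

Here both 'a' rows get 110, the 'b' row gets 103, and the 'c' row gets 13.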

Now, ORDER BY ... ea_month won't work with strings for month names. Postgres would sort alphabetically according to the locale setting.

If you have actual date values stored in your table, you can sort properly. If not, I suggest replacing ea_year and ea_month with a single column mon of type date in your table.

  • Transform what you have with to_date():

      to_date(ea_year || ea_month, 'YYYYMonth') AS mon

  • For display, you can get original strings with to_char():

      to_char(mon, 'Month') AS ea_month
      to_char(mon, 'YYYY') AS ea_year
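A small sketch of the difference between sorting month names as text and sorting the converted date. The sample rows are hypothetical, and English month names stored as text are assumed:

```sql
-- Hypothetical rows; as plain text these would sort April, February, January.
WITH tbl (ea_year, ea_month) AS (
   VALUES ('2015', 'April'), ('2015', 'February'), ('2015', 'January')
   )
SELECT ea_year, ea_month
     , to_date(ea_year || ea_month, 'YYYYMonth') AS mon  -- first day of that month
FROM   tbl
ORDER  BY mon;   -- chronological: January, February, April
```

Sorting by ea_month directly would return the alphabetical order instead, which is why the window's ORDER BY should use the date.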

While you are stuck with the unfortunate design, this will work:

SELECT ea_month, id, amount, ea_year, circle_id
, sum(amount) OVER (PARTITION BY circle_id ORDER BY mon) AS cum_amt
FROM (SELECT *, to_date(ea_year || ea_month, 'YYYYMonth') AS mon FROM tbl) sub
ORDER BY circle_id, mon;

SQL - cumulative sum in postgres

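You can use a window function and add the running sum to the initial amount (subtract it instead if amount is meant to reduce the initial amount):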

select t.*,
( initial_amount +
sum(amount) over (partition by item order by date_of_purchase)
) as cumulative
from t;
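For a quick check of how this behaves, here is a sketch against made-up rows. The column names follow the query above; the values are purely illustrative:

```sql
-- Hypothetical data: each item starts at initial_amount and accumulates amount.
WITH t (item, initial_amount, amount, date_of_purchase) AS (
   VALUES ('widget', 100, 5,  date '2020-01-01')
        , ('widget', 100, 10, date '2020-01-02')
        , ('gadget', 50,  1,  date '2020-01-01')
   )
SELECT t.*
     , initial_amount
       + sum(amount) OVER (PARTITION BY item ORDER BY date_of_purchase) AS cumulative
FROM   t
ORDER  BY item, date_of_purchase;
```

With these rows, gadget yields 51, and widget yields 105 and then 115.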

Postgresql - Cumulative sum of created users

I'd take a two-step approach. First, use an inner query to count how many users were created each month. Then, wrap this query with another query that calculates the cumulative sum of these counts:

SELECT   created_at, SUM(cnt) OVER (ORDER BY created_at ASC)
FROM (SELECT TO_CHAR(created_at, 'YYYY-MM') AS created_at, COUNT(*) AS cnt
FROM users
GROUP BY TO_CHAR(created_at, 'YYYY-MM')) t
ORDER BY 1 ASC;
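The same two-step idea can be tried without a real table. This sketch uses a hypothetical users CTE with three sign-ups; the alias names created_month and cumulative_users are assumptions for readability:

```sql
-- Hypothetical sign-ups: two in January 2021, one in March 2021.
WITH users (id, created_at) AS (
   VALUES (1, timestamp '2021-01-05 10:00')
        , (2, timestamp '2021-01-20 12:00')
        , (3, timestamp '2021-03-02 09:00')
   )
SELECT created_month, cnt
     , sum(cnt) OVER (ORDER BY created_month) AS cumulative_users
FROM  (
   -- step 1: count users per month
   SELECT to_char(created_at, 'YYYY-MM') AS created_month, count(*) AS cnt
   FROM   users
   GROUP  BY 1
   ) t
ORDER  BY created_month;
```

This returns 2021-01 with a cumulative count of 2 and 2021-03 with 3. Months without any new users (here 2021-02) do not appear in the result.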

Calculating the cumulative sum with a specific 'date' merged to single column in PostgreSQL

It looks like your sum() OVER () computes the wrong amount. Try:

 Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",

EDIT
If you want to format the output (cumulative amount only once per date), use row_number() to detect the first row in a group. Make sure the OVER clause is in sync with the ORDER BY of the query.

SELECT 
id,
first_name,
last_name,
birth_date,
card,
datetime,
amount,
case when rn=1 then "Daily Amount" end "Daily Amount"
FROM (
SELECT
s.id,
s.first_name,
s.last_name,
s.birth_date,
s.card,
p.datetime,
p.amount,
Sum(p.amount) OVER(partition BY s.id, p.datetime) AS "Daily Amount",
row_number() OVER(partition BY s.id, p.datetime ORDER BY p.amount) AS rn
FROM payments AS p
LEFT JOIN users AS s ON p.s_h_uuid = s.h_uuid
) t
ORDER BY datetime DESC, id, amount

Cumulative sum which is grouped by

(Assuming the office_id values were all supposed to be 1.)

You need two levels of nesting, one to do the grouping sum and one to do the cumulative sum:

select office_id, ts, sum(sum) over (partition by office_id order by ts) from (
SELECT office_id,
date_trunc('day', time) as ts,
sum(revenue) FROM revenue
GROUP BY 1, 2
) foo
ORDER BY office_id, ts;
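As a possible variant, Postgres also accepts a window function wrapped around an aggregate in the same query level, because window functions are evaluated after grouping. The sketch below uses hypothetical sample rows in a CTE named like the table above; the output aliases daily_revenue and cumulative_revenue are assumptions:

```sql
-- Hypothetical data: two entries on Jan 1, one on Jan 2, all for office 1.
WITH revenue (office_id, "time", revenue) AS (
   VALUES (1, timestamp '2021-01-01 09:00', 10)
        , (1, timestamp '2021-01-01 17:00', 5)
        , (1, timestamp '2021-01-02 11:00', 20)
   )
SELECT office_id
     , date_trunc('day', "time") AS ts
     , sum(revenue) AS daily_revenue
       -- inner sum(): aggregate per group; outer sum() OVER: running total across groups
     , sum(sum(revenue)) OVER (PARTITION BY office_id
                               ORDER BY date_trunc('day', "time")) AS cumulative_revenue
FROM   revenue
GROUP  BY 1, 2
ORDER  BY office_id, ts;
```

For this data the daily sums are 15 and 20, and the cumulative sums are 15 and 35. The nested form above may still be easier to read.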

