Group Query Results by Month and Year in PostgreSQL

Group query results by month and year in PostgreSQL

select to_char(date,'Mon') as mon,
extract(year from date) as yyyy,
sum("Sales") as "Sales"
from yourtable
group by 1,2

At the request of Radu, I will explain that query:

to_char(date,'Mon') as mon : formats the "date" column as the abbreviated month name (for example, "Jan").

extract(year from date) as yyyy : PostgreSQL's extract function pulls the four-digit year out of the "date" column.

sum("Sales") as "Sales" : the SUM() function adds up all the "Sales" values. The alias is double-quoted so that its mixed case is preserved; unquoted identifiers are folded to lower case.

group by 1,2 : the GROUP BY clause must contain every column of the SELECT list that is not inside an aggregate function (SUM, AVG, MIN, MAX and so on). It tells the query to apply SUM() once for each unique combination of those columns, which here are month and year. "1,2" is shorthand that refers to the first and second items of the SELECT list by position; for readability it is probably better to write out the full to_char(...) and extract(...) expressions.
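
As a rough sketch of that last point, the same query can be written with the expressions spelled out in GROUP BY. The sample rows are made up to stand in for yourtable, and the extra extract(month from date) term is an addition that is not in the original query; it is there only so the result can be ordered chronologically instead of alphabetically by month name:

```sql
-- Hypothetical sample data standing in for "yourtable".
WITH yourtable(date, "Sales") AS (
    VALUES (date '2019-01-15', 100),
           (date '2019-01-20',  50),
           (date '2019-02-03',  70),
           (date '2020-01-09',  30)
)
SELECT to_char(date, 'Mon')    AS mon,
       extract(year FROM date) AS yyyy,
       sum("Sales")            AS "Sales"
FROM yourtable
-- Same grouping as "group by 1,2", written out in full.
-- extract(month ...) does not change the groups (it is implied by 'Mon');
-- it only makes the month number available to ORDER BY.
GROUP BY to_char(date, 'Mon'),
         extract(year FROM date),
         extract(month FROM date)
ORDER BY extract(year FROM date),
         extract(month FROM date);
```

With this sample data the result has three rows: Jan 2019 (150), Feb 2019 (70) and Jan 2020 (30).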

PostgreSQL group data by month AND type

Use conditional aggregation:

select date_trunc('month', date) as mon,
sum((item_type = 'apple')::int) as apple,
sum((item_type = 'banana')::int) as banana
from t
group by mon
order by mon;
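
The same conditional aggregation can also be sketched with the aggregate FILTER clause (available since PostgreSQL 9.4). The rows below are invented sample data standing in for table t:

```sql
-- Hypothetical sample data standing in for table "t".
WITH t(date, item_type) AS (
    VALUES (date '2021-01-05', 'apple'),
           (date '2021-01-17', 'banana'),
           (date '2021-01-28', 'apple'),
           (date '2021-02-02', 'banana')
)
SELECT date_trunc('month', date)                    AS mon,
       -- count only the rows of each month that match the condition
       count(*) FILTER (WHERE item_type = 'apple')  AS apple,
       count(*) FILTER (WHERE item_type = 'banana') AS banana
FROM t
GROUP BY mon
ORDER BY mon;
```

For this data, January shows 2 apples and 1 banana, and February shows 0 apples and 1 banana.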

Group query results by month and year in PostgreSQL with empty month sum

Left join to a list of months:

SELECT t.txn_month, 
coalesce(sum(yt.amount),0) as monthly_sum
FROM generate_series(date '2019-02-01', date '2019-04-01', interval '1 month') as t(txn_month)
left join yourtable yt on date_trunc('month', yt.transdate) = t.txn_month
GROUP BY t.txn_month

In your actual query you need to move the conditions from the WHERE clause to the JOIN condition. Putting them into the WHERE clause turns the outer join back into an inner join:

SELECT t."ItemId",
y."transactionDate" AS txn_month,
-coalesce(SUM(t."transactionQty"),0) AS "TotalSold"
FROM generate_series(date '2018-01-01', date '2020-04-01', INTERVAL '1 month') as y("transactionDate")
LEFT JOIN "ItemTransactions" AS t
ON date_trunc('month', t."transactionDate") = y."transactionDate"
AND t."ItemTransactionTypeId" = 1
AND t."ItemId" = 10606
-- this WHERE clause isn't really needed because of the date values provided to generate_series()
WHERE y."transactionDate" >= NOW() - INTERVAL '2 year'
GROUP BY txn_month, t."ItemId"
ORDER BY txn_month DESC;
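
To see the difference between the two placements, here is a small sketch. The temporary table tx and its columns (transdate, type_id, amount) are hypothetical names chosen for illustration, not the real schema:

```sql
-- Hypothetical sample data: one matching row in February,
-- one non-matching row (type_id = 2) in April, nothing in March.
CREATE TEMP TABLE tx (transdate date, type_id int, amount int);
INSERT INTO tx VALUES ('2019-02-10', 1, 10), ('2019-04-05', 2, 99);

-- Filter in the JOIN condition: all three months are returned,
-- with 0 for March and April.
SELECT m.txn_month, coalesce(sum(tx.amount), 0) AS monthly_sum
FROM generate_series(date '2019-02-01', date '2019-04-01', interval '1 month') AS m(txn_month)
LEFT JOIN tx ON date_trunc('month', tx.transdate) = m.txn_month
            AND tx.type_id = 1
GROUP BY m.txn_month
ORDER BY m.txn_month;

-- Same filter in the WHERE clause: the joined rows where tx.type_id is
-- NULL (March) or 2 (April) are discarded, so only February is returned.
SELECT m.txn_month, coalesce(sum(tx.amount), 0) AS monthly_sum
FROM generate_series(date '2019-02-01', date '2019-04-01', interval '1 month') AS m(txn_month)
LEFT JOIN tx ON date_trunc('month', tx.transdate) = m.txn_month
WHERE tx.type_id = 1
GROUP BY m.txn_month
ORDER BY m.txn_month;
```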

SQL group query results by user id and date ranges dynamically

With all weeks starting on Monday, this would do it (efficiently):

SELECT id AS user_id, u."onboardedAt", u."closedAt"
, week_start, COALESCE(t.tx_count, 0) AS tx_count, a.last_user_action
FROM "Users" u
CROSS JOIN generate_series(date_trunc('week', u."onboardedAt"), u."closedAt", interval '1 week') AS week_start
LEFT JOIN (
SELECT "userId" AS id, date_trunc('week', t."createdAt") AS week_start, count(*) AS tx_count
FROM "Transactions" t
GROUP BY 1, 2
) t USING (id, week_start)
LEFT JOIN (
SELECT DISTINCT ON (1, 2)
"userId" AS id, date_trunc('week', a."createdAt") AS week_start, action AS last_user_action
FROM "UserActions" a
ORDER BY 1, 2, "createdAt" DESC
) a USING (id, week_start)
ORDER BY id, week_start;

Working with standard weeks makes everything much simpler. We can aggregate in the "many" tables before joining, which is simpler and cheaper. Else, multiple joins can go wrong quickly. See:

  • Two SQL LEFT JOINS produce incorrect result

Standard weeks make it easier to compare data, too. Note that the first and last week per user can be truncated (span fewer days), though that applies to the last week per user in any case.

The LATERAL keyword is assumed automatically in a join to a set-returning function:

CROSS  JOIN  generate_series(...)

See:

  • What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
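
A minimal sketch of that implicit LATERAL behaviour, written with the keyword spelled out. The users CTE and its lower-case column names are made-up sample data, not the "Users" table from the question:

```sql
-- Hypothetical sample data: two users with different active periods.
WITH users(id, onboarded_at, closed_at) AS (
    VALUES (1, timestamp '2021-01-06', timestamp '2021-01-20'),
           (2, timestamp '2021-02-01', timestamp '2021-02-03')
)
SELECT u.id, week_start
FROM users u
-- LATERAL lets the function arguments reference columns of "u".
-- For a set-returning function in FROM the keyword is optional.
CROSS JOIN LATERAL generate_series(date_trunc('week', u.onboarded_at),
                                   u.closed_at,
                                   interval '1 week') AS week_start
ORDER BY u.id, week_start;
```

User 1 gets the weeks starting 2021-01-04, 2021-01-11 and 2021-01-18; user 2 gets only the week starting 2021-02-01.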

Using DISTINCT ON to get the last_user_action per user. See:

  • Select first row in each GROUP BY group?

I advise using legal, lower-case identifiers, so double-quoting is not required. That makes your life with Postgres easier.

Use last non-null action

Added in a comment:

"if action is null in a current week, I want to grab most recent from previous weeks"

SELECT user_id, "onboardedAt", "closedAt", week_start, tx_count
, last_user_action AS last_user_action_with_null
, COALESCE(last_user_action, max(last_user_action) OVER (PARTITION BY user_id, null_grp)) AS last_user_action
FROM (
SELECT id AS user_id, u."onboardedAt", u."closedAt"
, week_start, COALESCE(t.tx_count, 0) AS tx_count, a.last_user_action
, count(a.last_user_action) OVER (PARTITION BY id ORDER BY week_start) AS null_grp
FROM "Users" u
CROSS JOIN generate_series(date_trunc('week', u."onboardedAt"), u."closedAt", interval '1 week') AS week_start
LEFT JOIN (
SELECT "userId" AS id, date_trunc('week', t."createdAt") AS week_start, count(*) AS tx_count
FROM "Transactions" t
GROUP BY 1, 2
) t USING (id, week_start)
LEFT JOIN (
SELECT DISTINCT ON (1, 2)
"userId" AS id, date_trunc('week', a."createdAt") AS week_start, action AS last_user_action
FROM "UserActions" a
ORDER BY 1, 2, "createdAt" DESC
) a USING (id, week_start)
) sub
ORDER BY user_id, week_start;

Explanation:

  • Retrieve last known value for each column of a row
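
The core of the technique can be sketched on a single made-up user. The running count() of non-null actions stays the same across a run of NULLs, so each non-null value and the NULLs that follow it land in the same null_grp, and max() over that group carries the value forward. The weeks CTE below is invented sample data:

```sql
-- Hypothetical sample data: weekly rows for one user, some without an action.
WITH weeks(user_id, week_start, last_user_action) AS (
    VALUES (1, date '2021-01-04', 'login'),
           (1, date '2021-01-11', NULL),
           (1, date '2021-01-18', NULL),
           (1, date '2021-01-25', 'purchase'),
           (1, date '2021-02-01', NULL)
)
SELECT user_id, week_start, last_user_action, null_grp,
       -- each group holds exactly one non-null value, so max() returns it
       max(last_user_action) OVER (PARTITION BY user_id, null_grp) AS filled
FROM (
    SELECT *,
           -- count() skips NULLs, so the running count only grows on non-null rows
           count(last_user_action) OVER (PARTITION BY user_id ORDER BY week_start) AS null_grp
    FROM weeks
) sub
ORDER BY user_id, week_start;
```

Here null_grp comes out as 1, 1, 1, 2, 2 and filled as login, login, login, purchase, purchase.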

PostgreSQL Query GROUP BY Year/Month - No function matches the given name?

In Postgres, use date_trunc():

SELECT date_trunc('month', u.created_at) as yyyymm, COUNT(*)
FROM users u
WHERE u.created_at IS NOT NULL
GROUP BY yyyymm
ORDER BY yyyymm;
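
Note that date_trunc() returns a timestamp, not a formatted string. If a textual year-month label is wanted, one possible sketch is to wrap it in to_char(); the users CTE below is made-up sample data:

```sql
-- Hypothetical sample data standing in for the "users" table.
WITH users(created_at) AS (
    VALUES (timestamp '2020-01-03 10:00'),
           (timestamp '2020-01-28 18:30'),
           (timestamp '2020-02-14 09:15')
)
SELECT to_char(date_trunc('month', u.created_at), 'YYYY-MM') AS yyyymm,
       count(*) AS user_count
FROM users u
-- group and sort on the truncated timestamp, format only for display
GROUP BY date_trunc('month', u.created_at)
ORDER BY date_trunc('month', u.created_at);
```

For this data the result is 2020-01 with a count of 2 and 2020-02 with a count of 1.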

Query a dataset applying filter on month and year in PostgreSQL

Try this:

SELECT * FROM my_table 
WHERE EXTRACT('month' from the_debt_paid) = 6
and EXTRACT('year' from the_debt_paid)=2019

You can write your query like this also:

SELECT * FROM my_table 
WHERE (EXTRACT(month from the_debt_paid), EXTRACT('year' from the_debt_paid))=(6,2019)

Either way, you compare the month and year by using EXTRACT.
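
An alternative that is not part of the answer above is to filter on a half-open date range, which compares the column directly and so can make use of a plain index on it. A minimal sketch, with invented sample rows standing in for my_table:

```sql
-- Hypothetical sample data standing in for "my_table".
WITH my_table(id, the_debt_paid) AS (
    VALUES (1, date '2019-05-31'),
           (2, date '2019-06-01'),
           (3, date '2019-06-30'),
           (4, date '2019-07-01')
)
SELECT *
FROM my_table
-- June 2019: first day inclusive, first day of the next month exclusive
WHERE the_debt_paid >= date '2019-06-01'
  AND the_debt_paid <  date '2019-07-01';
```

Only the rows with id 2 and 3 are returned.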