How to Select Id with Max Date Group by Category in PostgreSQL

How to select id with max date group by category in PostgreSQL?

This is a perfect use case for DISTINCT ON - a Postgres-specific extension of the standard DISTINCT:

SELECT DISTINCT ON (category)
       id  -- , category, date  -- any other column (expression) from the same row
FROM   tbl
ORDER  BY category, date DESC;
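To check the expected result without a Postgres server, here is a minimal sketch that runs the portable ROW_NUMBER() equivalent of the query on SQLite through Python's sqlite3 module. SQLite has no DISTINCT ON, so this is a swapped-in technique rather than the query above. The sample rows, the helper name latest_id_per_category, and the id tie-breaker are assumptions made for illustration, and window functions need SQLite 3.25 or newer.

```python
import sqlite3


def latest_id_per_category(rows):
    """Return {category: id} for the newest row in each category.

    Hypothetical helper: emulates DISTINCT ON (category) ... ORDER BY
    category, date DESC with ROW_NUMBER(), because SQLite lacks DISTINCT ON.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER, category TEXT, date TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    query = """
        SELECT category, id
        FROM (
            SELECT id, category,
                   ROW_NUMBER() OVER (
                       PARTITION BY category
                       ORDER BY date DESC, id  -- id breaks ties (assumption)
                   ) AS rn
            FROM tbl
        ) AS ranked
        WHERE rn = 1
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


# Assumed sample data: ISO-8601 date strings sort correctly as text.
sample = [
    (1, "a", "2024-01-01"),
    (2, "a", "2024-03-01"),
    (3, "b", "2024-02-01"),
    (4, "b", "2024-01-15"),
]
print(latest_id_per_category(sample))
```

With DISTINCT ON itself, ties on date are resolved arbitrarily unless you add more ORDER BY expressions; the extra id sort key in the sketch plays that role.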

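Careful with descending sort order: in Postgres, DESC sorts NULL values first, so a row with a NULL date would be picked as the "latest" one. If the column can be NULL, you may want to add NULLS LAST: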

  • Sort by column ASC, but NULL values first?

DISTINCT ON is simple and fast. Detailed explanation in this related answer:

  • Select first row in each GROUP BY group?

For big tables with many rows per category consider an alternative approach:

  • Optimize GROUP BY query to retrieve latest row per user
  • Optimize groupwise maximum query

How do you display the max date per category with conditions in postgres


SELECT
surname
, MAX(CASE WHEN blue_eyes THEN date_of_birth END) AS blue_eyes
, MAX(CASE WHEN blonde_hair THEN date_of_birth END) AS blonde_hair
, MAX(CASE WHEN right_handed THEN date_of_birth END) AS right_handed
FROM
t_info
GROUP BY surname
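As a rough illustration of how MAX(CASE WHEN ... END) behaves, the sketch below runs the same query shape on an in-memory SQLite database via Python's sqlite3. SQLite stands in for Postgres here, booleans are stored as 0/1 integers, and the sample rows and the helper name max_birthdate_by_trait are assumptions. Rows that do not satisfy a condition contribute NULL, which MAX ignores.

```python
import sqlite3


def max_birthdate_by_trait(rows):
    """Return one row per surname with the latest birth date per trait."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t_info (surname TEXT, date_of_birth TEXT, "
        "blue_eyes INTEGER, blonde_hair INTEGER, right_handed INTEGER)"
    )
    conn.executemany("INSERT INTO t_info VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT surname
             , MAX(CASE WHEN blue_eyes    THEN date_of_birth END) AS blue_eyes
             , MAX(CASE WHEN blonde_hair  THEN date_of_birth END) AS blonde_hair
             , MAX(CASE WHEN right_handed THEN date_of_birth END) AS right_handed
        FROM t_info
        GROUP BY surname
        ORDER BY surname  -- added only to make the output order stable
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Assumed sample data; flags are 1 (true) or 0 (false).
people = [
    ("Smith", "1980-05-01", 1, 0, 1),
    ("Smith", "1992-07-15", 1, 1, 0),
    ("Jones", "1975-11-30", 0, 0, 1),
]
for row in max_birthdate_by_trait(people):
    print(row)
```

A surname with no matching row for a trait gets NULL (None in Python) in that column. In Postgres 9.4 and later, MAX(date_of_birth) FILTER (WHERE blue_eyes) expresses the same thing.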

select id with max date and keep all same max date SQL

From your example - you seem to want a query that gives you all the rows that match the max date in each category?

If so, you should group by the category only (don't grab the ID in your t2). The subselect gives you each category and its maximum date; joining the table back to that derived table gives you all the rows that match that category and date, including ties.

SELECT t1.category, t1.id, t1.date
FROM "order" t1   -- order is a reserved word, so it must be quoted
INNER JOIN
(
    SELECT category, MAX(date) AS maxdate
    FROM "order"
    GROUP BY category
) t2
ON t1.category = t2.category
AND t1.date = t2.maxdate
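To make the tie behaviour concrete, here is a small sketch that runs this join on SQLite through Python's sqlite3 (a stand-in for Postgres). The sample rows and the helper name rows_with_max_date are assumptions; the table name is quoted because order is a reserved word in both systems.

```python
import sqlite3


def rows_with_max_date(rows):
    """Return every (category, id, date) row that carries its category's max date."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "order" (id INTEGER, category TEXT, date TEXT)')
    conn.executemany('INSERT INTO "order" VALUES (?, ?, ?)', rows)
    query = """
        SELECT t1.category, t1.id, t1.date
        FROM "order" t1
        INNER JOIN (
            SELECT category, MAX(date) AS maxdate
            FROM "order"
            GROUP BY category
        ) t2
          ON t1.category = t2.category
         AND t1.date = t2.maxdate
        ORDER BY t1.category, t1.id  -- added only for a stable output order
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Assumed sample data: ids 2 and 3 share the latest date in category 'a'.
orders = [
    (1, "a", "2024-01-01"),
    (2, "a", "2024-03-01"),
    (3, "a", "2024-03-01"),
    (4, "b", "2024-02-01"),
]
for row in rows_with_max_date(orders):
    print(row)
```

Both tied rows in category a come back, which is the difference from a one-row-per-group approach such as DISTINCT ON.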

PostgreSQL - min/max date range within group

This is a gaps-and-islands problem, where you want to group together adjacent rows that have the same product and status.

Here is an approach that uses the difference between row numbers to build the groups:

select product, status, min(start_date) start_date, max(end_date) end_date
from (
    select t.*,
        row_number() over(partition by product order by start_date) rn1,
        row_number() over(partition by product, status order by start_date) rn2
    from mytable t
) t
group by product, status, rn1 - rn2
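The row-number difference is easier to follow on concrete data. The sketch below runs the same idea on SQLite via Python's sqlite3 (a stand-in for Postgres; needs SQLite 3.25+ for window functions). The sample rows, the helper name collapse_islands, and the output aliases first_start and last_end are assumptions. Note that status has to be part of the GROUP BY, since it is selected without an aggregate.

```python
import sqlite3


def collapse_islands(rows):
    """Merge adjacent rows with the same product and status into one range."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (product TEXT, status TEXT, "
        "start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT product, status,
               MIN(start_date) AS first_start,
               MAX(end_date)   AS last_end
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY product
                                      ORDER BY start_date) AS rn1,
                   ROW_NUMBER() OVER (PARTITION BY product, status
                                      ORDER BY start_date) AS rn2
            FROM mytable t
        ) AS g
        -- rn1 - rn2 stays constant within a run of consecutive equal statuses
        GROUP BY product, status, rn1 - rn2
        ORDER BY product, first_start
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Assumed sample data: active, active, inactive, active for one product.
periods = [
    ("P1", "active", "2024-01-01", "2024-01-31"),
    ("P1", "active", "2024-02-01", "2024-02-29"),
    ("P1", "inactive", "2024-03-01", "2024-03-31"),
    ("P1", "active", "2024-04-01", "2024-04-30"),
]
for row in collapse_islands(periods):
    print(row)
```

For these rows the differences rn1 - rn2 are 0, 0, 2 and 1, so the first two active rows collapse into one range while the later active row stays separate.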

Query Value by Max Date in PostgreSQL

You may try other approaches that use row_number() to mark the most recent row for each customer and item. Order each partition by timestamp descending so that the latest row gets rn = 1, keep only those rows, then aggregate by customer id, taking the max of a CASE expression for each item name.

These approaches are less verbose and, based on the timings from the online environment, had a lower total time, mostly because of shorter planning; execution times were about the same. Let me know in the comments how this works when you replicate it in your environment.

You may use EXPLAIN ANALYZE to compare this approach to the current one. The results in the online environment provided:

Current Approach

Planning time: 0.129 ms
Execution time: 0.056 ms

Suggested Approach 1

Planning time: 0.061 ms
Execution time: 0.070 ms

Suggested Approach 2

Planning time: 0.047 ms
Execution time: 0.056 ms

NB. You may use EXPLAIN ANALYZE to compare these approaches in your environment which we cannot replicate online. The results may also vary on each run. Indexes and early filters on the item column are also recommended to improve performance.


Schema (PostgreSQL v9.5)

Suggested Approach 1

SELECT
    t1.customer_id,
    MAX(CASE WHEN t1.item='condition' THEN t1.value END) as conditio,
    MAX(CASE WHEN t1.item='price' THEN t1.value END) as price,
    MAX(CASE WHEN t1.item='feeling' THEN t1.value END) as feeling,
    MAX(CASE WHEN t1.item='weather' THEN t1.value END) as weather
FROM (
    SELECT
        * ,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id,item
            ORDER BY tbl.timestamp DESC
        ) as rn
    FROM
        tbl
    -- ensure that you filter based on your desired items
    -- indexes on item column are recommended to improve performance
) t1
WHERE rn=1
GROUP BY
    1;

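| customer_id | conditio | price | feeling | weather |
| ----------- | -------- | ----- | ------- | ------- |
| 001         | ok       | 1400  | fine    |         |
| 002         |          | 1900  | fine    | rain    |
| 003         | bad      | 2000  | sad     | sunny   |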
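For anyone who wants to try the row_number() plus CASE pivot without a Postgres instance, here is a minimal sketch on SQLite via Python's sqlite3 (window functions need SQLite 3.25+). The sample rows are invented for illustration and are not the data from the question, the helper name latest_values is hypothetical, and only three of the four items are pivoted to keep it short.

```python
import sqlite3


def latest_values(rows):
    """Pivot the most recent value of each item into one row per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (customer_id TEXT, item TEXT, value TEXT, timestamp TEXT)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT t1.customer_id,
               MAX(CASE WHEN t1.item = 'condition' THEN t1.value END) AS conditio,
               MAX(CASE WHEN t1.item = 'price'     THEN t1.value END) AS price,
               MAX(CASE WHEN t1.item = 'weather'   THEN t1.value END) AS weather
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id, item
                       ORDER BY timestamp DESC   -- newest row gets rn = 1
                   ) AS rn
            FROM tbl
        ) t1
        WHERE rn = 1
        GROUP BY t1.customer_id
        ORDER BY t1.customer_id  -- added only for a stable output order
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Invented sample data (assumption): older values should be ignored.
events = [
    ("001", "condition", "bad", "2024-01-01"),
    ("001", "condition", "ok", "2024-02-01"),
    ("001", "price", "1400", "2024-01-05"),
    ("002", "price", "1800", "2024-01-01"),
    ("002", "price", "1900", "2024-03-01"),
    ("002", "weather", "rain", "2024-01-10"),
]
for row in latest_values(events):
    print(row)
```

After the rn = 1 filter there is at most one row per customer and item, so MAX is only used to collapse the CASE results into a single row, not to compare values.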

Group by max date and id

Something like this should work:

SELECT t1.id, t1.value, t1.date
FROM your_table t1
INNER JOIN (
    SELECT id, MAX(date) AS max_date
    FROM your_table
    GROUP BY id
) t2
ON t1.id = t2.id AND t1.date = t2.max_date

select branch_id from report group by branch_id order by max(date) desc to Django Query

You should use aggregation together with values(), as described in the Django documentation, so something along the lines of:

from django.db.models import Max

qs = (Report.objects.values('branch_id')
      .annotate(max_date=Max('date'))
      .order_by('-max_date'))

