How to Select a Max Row for Each Group in SQL

SQL to get max value from each group

select * from [table] t1
inner join
(
select track_id, user_id, max(rating) maxRating
from [table]
group by track_id, user_id
) tmp
on t1.track_id = tmp.track_id
and t1.user_id = tmp.user_id
and t1.rating = tmp.maxRating;

Get records with max value for each group of grouped SQL results

There's a super-simple way to do this in MySQL:

select * 
from (select * from mytable order by `Group`, age desc, Person) x
group by `Group`

This works because MySQL lets you select columns that are neither aggregated nor listed in the GROUP BY clause; in practice it then returns the first row it encounters for each group, although the documentation describes the chosen row as indeterminate. The solution is to first order the data so that, within each group, the row you want comes first, and then group by the column(s) that define the group.

You avoid complicated subqueries that try to find the max() and so on, and you also avoid returning multiple rows when more than one row shares the same maximum value (as the other answers would).

Note: This is a MySQL-only solution. All other databases I know of will reject the query with an error such as "non aggregated columns are not listed in the group by clause" or similar. Because this solution relies on undocumented behavior, the more cautious may want to include a test asserting that it still works, in case a future version of MySQL changes this behavior.

Version 5.7 update:

Since version 5.7, the sql-mode setting includes ONLY_FULL_GROUP_BY by default, so to make this work you must disable that mode (edit the server's option file to remove it from the sql-mode setting).

SQL Query to select each row with max value per group

Use a subquery to get the max runid for each environmentid from the runtable. Join the obtained result to the issuetable and select the required columns.

select i.id, i.runid, i.value, r.environmentid
from (select environmentid, max(runid) maxrunid
from runtable
group by environmentid) r
join issuetable i on i.runid = r.maxrunid
order by i.runid, i.id
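
The two steps above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient harness (an assumption; the original does not name a database), and the sample rows are hypothetical. Only the table and column names come from the query above.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE runtable (runid INTEGER PRIMARY KEY, environmentid INTEGER);
    CREATE TABLE issuetable (id INTEGER PRIMARY KEY, runid INTEGER, value TEXT);
    INSERT INTO runtable VALUES (1, 10), (2, 10), (3, 20), (4, 20);
    INSERT INTO issuetable VALUES
        (100, 1, 'old-a'), (101, 2, 'new-a'), (102, 2, 'new-b'),
        (103, 3, 'old-c'), (104, 4, 'new-c');
""")

# Step 1 (subquery r): latest runid per environment.
# Step 2 (join): keep only the issues that belong to those latest runs.
latest_issues = conn.execute("""
    SELECT i.id, i.runid, i.value, r.environmentid
    FROM (SELECT environmentid, MAX(runid) AS maxrunid
          FROM runtable
          GROUP BY environmentid) r
    JOIN issuetable i ON i.runid = r.maxrunid
    ORDER BY i.runid, i.id
""").fetchall()

for row in latest_issues:
    print(row)
```

With this data, the issues attached to the older runs 1 and 3 drop out, and every issue of the latest run of each environment is kept.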

Oracle SQL get row with MAX value for each group in a set of grouped results

You can use a CTE:

WITH CTE0 AS 
(
SELECT
REV_USAGE_DATA.DDATE,
REV_USAGE_DATA.SEGMENT,
COUNT(*) AS Freq
FROM CADA_PERMSISDN_DASH REV_USAGE_DATA
GROUP BY
REV_USAGE_DATA.DDATE,
REV_USAGE_DATA.SEGMENT
)
SELECT
DDATE,
SEGMENT,
FREQ
FROM CTE0
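WHERE (DDATE, FREQ) IN (
-- Match on the per-date maximum count only: MAX(SEGMENT) and MAX(FREQ) are
-- computed independently and need not come from the same row.
SELECT DDATE, MAX(FREQ)
FROM CTE0
GROUP BY DDATE
)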
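
One thing worth checking with this kind of query is which columns the IN subquery matches on. MAX(SEGMENT) and MAX(FREQ) are each computed over the whole date group, so the pair does not necessarily belong to any single row. The sketch below compares matching on both maxima with matching on the per-date maximum count alone. It runs on SQLite through Python's sqlite3 module rather than Oracle (an assumption made so the example is runnable), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CADA_PERMSISDN_DASH (DDATE TEXT, SEGMENT TEXT);
    -- Hypothetical data: on 01-01 segment A dominates, on 01-02 segment B does.
    INSERT INTO CADA_PERMSISDN_DASH VALUES
        ('2024-01-01', 'A'), ('2024-01-01', 'A'), ('2024-01-01', 'A'),
        ('2024-01-01', 'B'),
        ('2024-01-02', 'A'),
        ('2024-01-02', 'B'), ('2024-01-02', 'B');
""")

# Shared CTE: row count per (date, segment).
CTE = """
    WITH CTE0 AS (
        SELECT DDATE, SEGMENT, COUNT(*) AS FREQ
        FROM CADA_PERMSISDN_DASH
        GROUP BY DDATE, SEGMENT
    )
"""

# Matches on (date, max segment, max count): the two maxima may come
# from different rows, so a date can silently disappear.
independent_maxima = conn.execute(CTE + """
    SELECT DDATE, SEGMENT, FREQ FROM CTE0
    WHERE (DDATE, SEGMENT, FREQ) IN (
        SELECT DDATE, MAX(SEGMENT), MAX(FREQ) FROM CTE0 GROUP BY DDATE)
    ORDER BY DDATE
""").fetchall()

# Matches on (date, max count) only: one or more top segments per date.
per_date_max = conn.execute(CTE + """
    SELECT DDATE, SEGMENT, FREQ FROM CTE0
    WHERE (DDATE, FREQ) IN (
        SELECT DDATE, MAX(FREQ) FROM CTE0 GROUP BY DDATE)
    ORDER BY DDATE
""").fetchall()

print(independent_maxima)
print(per_date_max)
```

On 2024-01-01 the most frequent segment is A, but MAX(SEGMENT) is B, so the first query has no row to match for that date; the second query returns the top segment for both dates.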

SQL select only rows with max value on a column

At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

It's never that simple, is it?

I just noticed you need the content column as well.

This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I have heard it a lot during my career. Actually, it was one of the questions I answered in my current job's technical interview.

It is, actually, so common that the Stack Overflow community has created a tag just to deal with questions like that: greatest-n-per-group.

Basically, you have two approaches to solve that problem:

Joining with simple group-identifier, max-value-in-group Sub-query

In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev

Left Joining with self, tweaking join conditions and filters

In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:

  1. The second join condition requires the left-side value to be less than the right-side value.
  2. When you do step 1, the row(s) that actually have the max value will have NULL in the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
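
To see the two moves at work, the sketch below first prints the unfiltered LEFT JOIN and then the filtered result. It uses Python's sqlite3 module as a stand-in database (an assumption), and the rows in YourTable are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT);
    INSERT INTO YourTable VALUES
        (1, 1, 'a1'), (1, 2, 'a2'), (1, 3, 'a3'),
        (2, 1, 'b1'), (2, 2, 'b2');
""")

# Move 1: pair every row with all rows of the same id that have a higher rev.
# A row that already holds the highest rev finds no partner, so b.rev is NULL.
joined = conn.execute("""
    SELECT a.id, a.rev, b.rev
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
      ON a.id = b.id AND a.rev < b.rev
    ORDER BY a.id, a.rev, b.rev
""").fetchall()

# Move 2: keep only the rows that found no higher-rev partner.
max_rows = conn.execute("""
    SELECT a.*
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
      ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id
""").fetchall()

for row in joined:
    print(row)      # (id, rev, higher rev or None)
print(max_rows)
```

In the intermediate result, only the rows (1, 3) and (2, 2) carry None on the right side, and those are exactly the rows the final query returns.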

Conclusion

Both approaches return exactly the same result.

If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
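
The tie behavior is easy to check. In this sketch (SQLite through Python's sqlite3 module, with hypothetical rows), id 1 has two rows sharing the highest rev, and both queries are run against the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT);
    -- Hypothetical data: id 1 has two rows tied at rev = 2.
    INSERT INTO YourTable VALUES
        (1, 1, 'x'), (1, 2, 'y'), (1, 2, 'z'), (2, 1, 'w');
""")

# Approach 1: join against the (id, MAX(rev)) subquery.
join_rows = conn.execute("""
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id, a.contents
""").fetchall()

# Approach 2: left self-join, keeping rows with no higher-rev partner.
anti_join_rows = conn.execute("""
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
      ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
    ORDER BY a.id, a.contents
""").fetchall()

print(join_rows)
print(anti_join_rows)
```

Both lists contain the two tied rows for id 1. If exactly one row per group is required, an extra tie-breaking condition is needed in either approach.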

Both approaches are ANSI SQL compatible and will therefore work with your favorite RDBMS, regardless of its "flavor".

Both approaches are also performance friendly; however, your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one that makes the most sense to you.

Return the row with max value for each group

A perfect use case for DISTINCT ON (a PostgreSQL extension):

SELECT DISTINCT ON (realm, race) *
FROM tbl
ORDER BY realm, race, total DESC;

Notably, the query has no GROUP BY at all.

This assumes total is NOT NULL; otherwise, write total DESC NULLS LAST in the ORDER BY.

In case of a tie, the winner is arbitrary unless you add more ORDER BY items to break the tie.
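
DISTINCT ON is specific to PostgreSQL, so the query cannot be run on most other engines. As a rough model of what it does, the plain Python sketch below sorts the rows by the ORDER BY keys and keeps the first row of each (realm, race) group. The sample rows are hypothetical, and this only illustrates the semantics; it is not how PostgreSQL executes the query.

```python
from itertools import groupby

# Hypothetical rows of tbl; column names follow the query above.
rows = [
    {"realm": "east", "race": "elf", "total": 10},
    {"realm": "east", "race": "elf", "total": 25},
    {"realm": "east", "race": "orc", "total": 7},
    {"realm": "west", "race": "elf", "total": 3},
    {"realm": "west", "race": "elf", "total": 1},
]

def group_key(row):
    # Mirrors the DISTINCT ON (realm, race) expressions.
    return (row["realm"], row["race"])

# ORDER BY realm, race, total DESC (negating total gives descending order).
ordered = sorted(rows, key=lambda r: (r["realm"], r["race"], -r["total"]))

# Keep the first row of each group, as DISTINCT ON does after sorting.
winners = [next(group) for _, group in groupby(ordered, key=group_key)]

for row in winners:
    print(row)
```

Because only the first row per group survives, the leading ORDER BY items must match the DISTINCT ON expressions, and any further items (here total DESC) decide which row wins.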

Detailed explanation:

  • Select first row in each GROUP BY group?

