SQL - How to Select a Row Having a Column with Max Value

SQL select only rows with max value on a column


At first glance...

All you need is a GROUP BY clause with the MAX aggregate function:

SELECT id, MAX(rev)
FROM YourTable
GROUP BY id

It's never that simple, is it?

I just noticed you need the content column as well.

This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I have heard it a lot during my career. Actually, it was one of the questions I answered in my current job's technical interview.

It is, actually, so common that the Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.

Basically, you have two approaches to solve that problem:

Joining with simple group-identifier, max-value-in-group Sub-query

In this approach, you first find the group-identifier, max-value-in-group (already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier and max-value-in-group:

SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
    SELECT id, MAX(rev) rev
    FROM YourTable
    GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
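To see the join-back in action, here is a minimal sketch that runs the query through Python's built-in sqlite3 module. The sample rows are made up for illustration; only the table and column names come from the query above.

```python
import sqlite3

# Hypothetical sample data: id 1 has three revisions, id 2 has one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (1, 3, "c"), (2, 1, "d")],
)

# Join every row to its group's (id, MAX(rev)) pair; only the max rows match.
rows = conn.execute(
    """
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (
        SELECT id, MAX(rev) AS rev
        FROM YourTable
        GROUP BY id
    ) b ON a.id = b.id AND a.rev = b.rev
    ORDER BY a.id
    """
).fetchall()
print(rows)  # [(1, 3, 'c'), (2, 1, 'd')]
```

The ORDER BY is only there to make the printed output deterministic.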

Left Joining with self, tweaking join conditions and filters

In this approach, you left join the table with itself. Equality goes in the group-identifier. Then, 2 smart moves:

  1. The second join condition requires the left-side value to be less than the right-side value.
  2. When you do step 1, the row(s) that actually have the max value will have NULL on the right side (it's a LEFT JOIN, remember?). Then, we filter the joined result, showing only the rows where the right side is NULL.

So you end up with:

SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
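It can help to look at the joined rows before the WHERE filter is applied. The sketch below, assuming some made-up sample rows and Python's sqlite3 module, runs the same join without the filter and then keeps only the rows whose right side is NULL.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (2, 1, "d")],
)

# The self-join without the WHERE clause, so the unmatched (NULL) rows are visible.
joined = conn.execute(
    """
    SELECT a.id, a.rev, b.rev
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    ORDER BY a.id, a.rev
    """
).fetchall()
print(joined)  # [(1, 1, 2), (1, 2, None), (2, 1, None)]

# Rows with no "bigger" partner are the per-group maxima.
maxima = [(row_id, rev) for row_id, rev, bigger_rev in joined if bigger_rev is None]
print(maxima)  # [(1, 2), (2, 1)]
```

The row (1, 1) found a partner with a higher rev, so it is dropped; the other two rows found none and survive.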

Conclusion

Both approaches bring the exact same result.

If you have two rows with max-value-in-group for group-identifier, both rows will be in the result in both approaches.
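The tie behavior is easy to check on a small table. This is a sketch with made-up data in which two rows share the max rev for the same id, run through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical data: rows 'x' and 'y' tie for the max rev of id 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, rev INTEGER, contents TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 3, "x"), (1, 3, "y"), (1, 1, "z")],
)

JOIN_SQL = """
    SELECT a.id, a.rev, a.contents
    FROM YourTable a
    INNER JOIN (SELECT id, MAX(rev) AS rev FROM YourTable GROUP BY id) b
        ON a.id = b.id AND a.rev = b.rev
"""
LEFT_JOIN_SQL = """
    SELECT a.*
    FROM YourTable a
    LEFT OUTER JOIN YourTable b
        ON a.id = b.id AND a.rev < b.rev
    WHERE b.id IS NULL
"""

# Sort in Python so the comparison does not depend on row order.
join_rows = sorted(conn.execute(JOIN_SQL).fetchall())
left_rows = sorted(conn.execute(LEFT_JOIN_SQL).fetchall())
print(join_rows)  # [(1, 3, 'x'), (1, 3, 'y')]
print(left_rows)  # [(1, 3, 'x'), (1, 3, 'y')]
```

If you need exactly one row per group, you have to add a tie-breaker yourself, because neither query picks one for you.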

Both approaches are ANSI SQL compatible and thus will work with your favorite RDBMS, regardless of its "flavor".

Both approaches are also performance friendly; however, your mileage may vary (RDBMS, DB structure, indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which makes the most sense to you.

How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?

You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten table on BOTH fields:

SELECT tt.*
FROM topten tt
INNER JOIN (
    SELECT home, MAX(datetime) AS MaxDateTime
    FROM topten
    GROUP BY home
) groupedtt
    ON tt.home = groupedtt.home
    AND tt.datetime = groupedtt.MaxDateTime

MySQL - How to select rows with max value of a field

If you want to get ties, then you can do something like this:

select s.*
from scores s
where s.score = (select max(s2.score) from scores s2 where s2.level = s.level);

You could get one row per level by aggregating this:

select s.level, s.score, group_concat(s.user_id)
from scores s
where s.score = (select max(s2.score) from scores s2 where s2.level = s.level)
group by s.level, s.score;

This combines the users (if there is more than one) into a single field.
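As a rough illustration, the aggregated query can be tried in SQLite, which also provides group_concat. The sample scores below are made up, and the order of the ids inside the concatenated field is not guaranteed, so the sketch sorts them in Python before printing.

```python
import sqlite3

# Hypothetical data: users 1 and 2 tie for the top score on level 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (user_id INTEGER, level INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 10), (3, 1, 5), (4, 2, 7)],
)

rows = conn.execute(
    """
    SELECT s.level, s.score, group_concat(s.user_id)
    FROM scores s
    WHERE s.score = (SELECT MAX(s2.score) FROM scores s2 WHERE s2.level = s.level)
    GROUP BY s.level, s.score
    ORDER BY s.level
    """
).fetchall()

# Split the comma-separated ids and sort them for a stable result.
by_level = {level: (score, sorted(ids.split(","))) for level, score, ids in rows}
print(by_level)  # {1: (10, ['1', '2']), 2: (7, ['4'])}
```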

SQL : Keep ONE row with max value on a column depending on value of another column

One method is rank() or row_number():

select t.*
from (select t.*,
             row_number() over (partition by id order by col1 desc, col2 asc) as seqnum
      from t
     ) t
where seqnum = 1;

You would use rank() if you want multiple rows when several rows tie on the max col1 and min col2 for the same id.
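The difference between the two functions shows up only when rows tie. Here is a small sketch with made-up data, assuming a Python build whose bundled SQLite supports window functions (SQLite 3.25 or later). The helper top_rows and the alias ranked are names invented for this example.

```python
import sqlite3

# Hypothetical data: id 1 has two identical rows tied on (col1, col2).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 5, 1), (1, 5, 1), (1, 3, 0), (2, 4, 2)],
)


def top_rows(window_fn):
    """Return the rows numbered 1 per id, using row_number or rank."""
    sql = f"""
        SELECT id, col1, col2
        FROM (
            SELECT t.*,
                   {window_fn}() OVER (PARTITION BY id ORDER BY col1 DESC, col2 ASC) AS seqnum
            FROM t
        ) ranked
        WHERE seqnum = 1
        ORDER BY id
    """
    return conn.execute(sql).fetchall()


one_per_id = top_rows("row_number")
with_ties = top_rows("rank")
print(one_per_id)  # [(1, 5, 1), (2, 4, 2)]
print(with_ties)   # [(1, 5, 1), (1, 5, 1), (2, 4, 2)]
```

row_number() numbers the tied rows 1 and 2, so only one survives; rank() gives both of them 1.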

how to get a Row with Max value of a column?

You can use a correlated subquery:

select t.*
from mytable t
where t.srno = (select max(srno) from mytable t1 where t1.p_id = t.p_id)

With an index on (p_id, srno), this should be an efficient solution.
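A quick way to try the indexed version is sketched below with Python's sqlite3 module. The index name idx_mytable_pid_srno and the sample rows are assumptions for the example, and the plan text printed by EXPLAIN QUERY PLAN is specific to SQLite and varies between versions; other databases have their own tools for inspecting plans.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (p_id INTEGER, name TEXT, srno INTEGER, rate REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, "a", 1, 10.0), (1, "b", 2, 11.0), (2, "c", 1, 9.0)],
)
# Composite index matching the correlated subquery's filter and aggregate.
conn.execute("CREATE INDEX idx_mytable_pid_srno ON mytable (p_id, srno)")

QUERY = """
    SELECT t.p_id, t.name, t.srno
    FROM mytable t
    WHERE t.srno = (SELECT MAX(srno) FROM mytable t1 WHERE t1.p_id = t.p_id)
    ORDER BY t.p_id
"""
rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 'b', 2), (2, 'c', 1)]

# Print SQLite's plan; the last column holds the human-readable detail.
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(plan_row[-1])
```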

Another common solution is to use row_number():

select p_id, name, srno, rate
from (
    select t.*, row_number() over(partition by p_id order by srno desc) rn
    from mytable t
) t
where rn = 1

Selecting a Record With MAX Value

Here's an option if you have multiple records for each Customer and are looking for the latest balance for each (say they are dated records):

SELECT ID, BALANCE FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY DateModified DESC) as RowNum, ID, BALANCE
    FROM CUSTOMERS
) C
WHERE RowNum = 1

Select row with max value with having clause

You can use TOP and ORDER BY for this:

SELECT TOP 1
d.ID AS dep_id,
sum(u.Salary) AS Sum_Salary
from dbo.users u
INNER JOIN Departments d ON u.DepartmentID=d.id
GROUP BY d.ID
order by Sum_Salary desc;

This returns the single row with the maximum Sum_Salary.

If you just want to find the maximum Sum_Salary, use MAX:

SELECT
    MAX(s.Sum_Salary)
FROM
    (SELECT
        SUM(u.Salary) AS Sum_Salary
    FROM
        dbo.users u
    INNER JOIN
        Departments d ON u.DepartmentID = d.id
    GROUP BY
        d.ID) s
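Both queries can be tried on a toy dataset. TOP is SQL Server syntax, so this sketch, which runs on SQLite through Python's sqlite3 module, swaps in ORDER BY ... LIMIT 1 for the first query and drops the dbo schema prefix. The sample salaries are made up.

```python
import sqlite3

# Hypothetical data: department 1 totals 300, department 2 totals 250.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Departments (id INTEGER)")
conn.execute("CREATE TABLE users (id INTEGER, Salary INTEGER, DepartmentID INTEGER)")
conn.executemany("INSERT INTO Departments VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, 100, 1), (2, 200, 1), (3, 250, 2)],
)

# SQLite stand-in for SELECT TOP 1 ... ORDER BY Sum_Salary DESC.
top_department = conn.execute(
    """
    SELECT d.id AS dep_id, SUM(u.Salary) AS Sum_Salary
    FROM users u
    INNER JOIN Departments d ON u.DepartmentID = d.id
    GROUP BY d.id
    ORDER BY Sum_Salary DESC
    LIMIT 1
    """
).fetchone()
print(top_department)  # (1, 300)

# Only the maximum of the per-department sums, without the department id.
max_sum = conn.execute(
    """
    SELECT MAX(s.Sum_Salary)
    FROM (
        SELECT SUM(u.Salary) AS Sum_Salary
        FROM users u
        INNER JOIN Departments d ON u.DepartmentID = d.id
        GROUP BY d.id
    ) s
    """
).fetchone()[0]
print(max_sum)  # 300
```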

SQL Query to select each row with max value per group

Use a subquery to get the max runid for each environmentid from the runtable. Join the obtained result to the issuetable and select the required columns.

select i.id, i.runid, i.value, r.environmentid
from (select environmentid, max(runid) maxrunid
      from runtable
      group by environmentid) r
join issuetable i on i.runid = r.maxrunid
order by i.runid, i.id
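Here is a small sketch of that query on made-up data, run through Python's sqlite3 module. The column types and sample rows are assumptions; only the table and column names come from the query.

```python
import sqlite3

# Hypothetical data: 'dev' has runs 1 and 2, 'prod' has run 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runtable (runid INTEGER, environmentid TEXT)")
conn.execute("CREATE TABLE issuetable (id INTEGER, runid INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO runtable VALUES (?, ?)",
    [(1, "dev"), (2, "dev"), (3, "prod")],
)
conn.executemany(
    "INSERT INTO issuetable VALUES (?, ?, ?)",
    [(10, 1, "old"), (11, 2, "x"), (12, 2, "y"), (13, 3, "z")],
)

# Only issues belonging to each environment's latest run are returned.
rows = conn.execute(
    """
    SELECT i.id, i.runid, i.value, r.environmentid
    FROM (
        SELECT environmentid, MAX(runid) AS maxrunid
        FROM runtable
        GROUP BY environmentid
    ) r
    JOIN issuetable i ON i.runid = r.maxrunid
    ORDER BY i.runid, i.id
    """
).fetchall()
print(rows)  # [(11, 2, 'x', 'dev'), (12, 2, 'y', 'dev'), (13, 3, 'z', 'prod')]
```

Issue 10 belongs to run 1, which is not the latest run for 'dev', so it is left out.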

Select rows with Max(Column Value) for each unique combination of two other columns

In MySQL 5.x you can use a sub-query.

SELECT *
FROM your_table
WHERE (`Group`, Dataset, RunNumber) IN (
    SELECT `Group`, Dataset, MAX(RunNumber) AS MaxRunNumber
    FROM your_table
    GROUP BY `Group`, Dataset
);


Alternatives

--
-- LEFT JOIN on bigger
--
SELECT t.*
FROM your_table t
LEFT JOIN your_table t2
ON t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
WHERE t2.RunNumber IS NULL
ORDER BY t.`Group`, t.Dataset;

--
-- where NOT EXISTS on bigger
--
SELECT *
FROM your_table t
WHERE NOT EXISTS (
SELECT 1
FROM your_table t2
WHERE t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
)
ORDER BY `Group`, Dataset;

--
-- Emulating DENSE_RANK = 1 with variables
-- Works also in 5.x
--
SELECT RunNumber, `Group`, Dataset, Total
FROM
(
SELECT
@rnk:=IF(@ds=Dataset AND @grp=`Group`, IF(@run=RunNumber, @rnk, @rnk+1), 1) AS Rnk
, @grp := `Group` as `Group`
, @ds := Dataset as Dataset
, @run := RunNumber as RunNumber
, Total
FROM your_table t
CROSS JOIN (SELECT @grp:=null, @ds:=null, @run:=null, @rnk := 0) var
ORDER BY `Group`, Dataset, RunNumber DESC
) q
WHERE Rnk = 1
ORDER BY `Group`, Dataset;

--
-- DENSE_RANK = 1
-- MySql 8 and beyond.
--
SELECT *
FROM
(
SELECT *
, DENSE_RANK() OVER (PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC) AS rnk
FROM your_table
) q
WHERE rnk = 1
ORDER BY `Group`, Dataset;
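To check that the alternatives agree, the sketch below runs three of them (the tuple IN sub-query, the LEFT JOIN and the NOT EXISTS) on the same made-up rows using Python's sqlite3 module. It assumes a SQLite version with row-value support (3.15 or later) and quotes the Group column with double quotes instead of MySQL backticks.

```python
import sqlite3

# Hypothetical data: group A / dataset d1 has two runs, the others have one.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE your_table ("Group" TEXT, Dataset TEXT, RunNumber INTEGER, Total INTEGER)'
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [("A", "d1", 1, 10), ("A", "d1", 2, 20), ("A", "d2", 1, 5), ("B", "d1", 3, 7)],
)

QUERIES = {
    "tuple_in": '''
        SELECT * FROM your_table
        WHERE ("Group", Dataset, RunNumber) IN (
            SELECT "Group", Dataset, MAX(RunNumber)
            FROM your_table
            GROUP BY "Group", Dataset
        )
    ''',
    "left_join": '''
        SELECT t.* FROM your_table t
        LEFT JOIN your_table t2
            ON t2."Group" = t."Group"
            AND t2.Dataset = t.Dataset
            AND t2.RunNumber > t.RunNumber
        WHERE t2.RunNumber IS NULL
    ''',
    "not_exists": '''
        SELECT * FROM your_table t
        WHERE NOT EXISTS (
            SELECT 1 FROM your_table t2
            WHERE t2."Group" = t."Group"
              AND t2.Dataset = t.Dataset
              AND t2.RunNumber > t.RunNumber
        )
    ''',
}

# Sort in Python so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in QUERIES.items()}
print(results["tuple_in"])  # [('A', 'd1', 2, 20), ('A', 'd2', 1, 5), ('B', 'd1', 3, 7)]
```

All three entries in results hold the same three rows for this data.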

