SQL select only rows with max value on a column
At first glance...
All you need is a GROUP BY
clause with the MAX
aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content
column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.
It is, actually, so common that Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group
Sub-query
In this approach, you first find the group-identifier, max-value-in-group
(already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier
and max-value-in-group
:
SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality goes in the group-identifier
. Then, 2 smart moves:
- The second join condition is having left side value less than right value
- When you do step 1, the row(s) that actually have the max value will have
NULL
in the right side (it's aLEFT JOIN
, remember?). Then, we filter the joined result, showing only the rows where the right side isNULL
.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group
for group-identifier
, both rows will be in the result in both approaches.
Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.
SQL : Keep ONE row with max value on a column depending on value of another column
One method is rank()
or row_number()
;
select t.*
from (select t.*,
row_number() over (partition by id order by col1 desc, col2 asc) as seqnum
from t
) t
where seqnum = 1;
You would use rank()
if you want multiple rows when there are duplicate max col1/ min col2 for the same id.
how to get a Row with Max value of a column?
You can use a correlated subquery:
select t.*
from mytable t
where t.srno = (select max(srno) from mytable t1 where t1.p_id = t.p_id)
With an index on (p_id, srno)
, this should be an efficient solution.
Anoter common solution is to use row_number()
:
select pid, name, srno, rate
from (
select t.*, row_number() over(partition by p_id order by srno desc) rn
from mytable t
) t
where rn = 1
How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?
You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten
table on BOTH fields:
SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
Select rows with Max(Column Value) for each unique combination of two other columns
In MySQL 5.x you can use a sub-query.
SELECT *
FROM your_table
WHERE (`Group`, Dataset, RunNumber) IN (
SELECT `Group`, Dataset, MAX(RunNumber) AS MaxRunNumber
FROM your_table
GROUP BY `Group`, Dataset
);
Test on db<>fiddle here
Alternatives
--
-- LEFT JOIN on bigger
--
SELECT t.*
FROM your_table t
LEFT JOIN your_table t2
ON t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
WHERE t2.RunNumber IS NULL
ORDER BY t.`Group`, t.Dataset;
--
-- where NOT EXISTS on bigger
--
SELECT *
FROM your_table t
WHERE NOT EXISTS (
SELECT 1
FROM your_table t2
WHERE t2.`Group` = t.`Group`
AND t2.Dataset = t.Dataset
AND t2.RunNumber > t.RunNumber
)
ORDER BY `Group`, Dataset;
--
-- Emulating DENSE_RANK = 1 with variables
-- Works also in 5.x
--
SELECT RunNumber, `Group`, Dataset, Total
FROM
(
SELECT
@rnk:=IF(@ds=Dataset AND @grp=`Group`, IF(@run=RunNumber, @rnk, @rnk+1), 1) AS Rnk
, @grp := `Group` as `Group`
, @ds := Dataset as Dataset
, @run := RunNumber as RunNumber
, Total
FROM your_table t
CROSS JOIN (SELECT @grp:=null, @ds:=null, @run:=null, @rnk := 0) var
ORDER BY `Group`, Dataset, RunNumber DESC
) q
WHERE Rnk = 1
ORDER BY `Group`, Dataset;
--
-- DENSE_RANK = 1
-- MySql 8 and beyond.
--
SELECT *
FROM
(
SELECT *
, DENSE_RANK() OVER (PARTITION BY `Group`, Dataset ORDER BY RunNumber DESC) AS rnk
FROM your_table
) q
WHERE rnk = 1
ORDER BY `Group`, Dataset;
How can I SELECT the first row with MAX(Column value)?
Why does your second query not work...
select Item_No,
Quantity
from Rec_details
group by Item_No,
Quantity
having Quantity=max(Quantity);
You are grouping by both Item_No
and Quantity
and the Item_No
appears to be the primary key and contain unique values so each group will only contain one row. The HAVING
clause looks within the group so it will check that the value of quantity
is the maximum value within that group but there is only one value within the group so this will always be true. Your query is the equivalent of:
SELECT DISTINCT
Item_No,
Quantity
FROM Rec_details;
Some other ways to get the maximum value:
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table Rec_details (item_no, Quantity ) AS
SELECT 12507,1 FROM DUAL UNION ALL
SELECT 12549,4 FROM DUAL UNION ALL
SELECT 12100,8 FROM DUAL UNION ALL
SELECT 12501,2 FROM DUAL UNION ALL
SELECT 12201,7 FROM DUAL UNION ALL
SELECT 12509,3 FROM DUAL UNION ALL
SELECT 12080,1 FROM DUAL;
Query 1 - Get one row with maximum quantity
and latest item_no
(using 1 table scan):
SELECT MAX( item_no ) KEEP ( DENSE_RANK LAST ORDER BY Quantity ) AS Item_no,
MAX( Quantity ) AS Quantity
FROM Rec_Details
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 2 - Get one row with maximum quantity
and latest item_no
(using 1 table scan):
SELECT *
FROM (
SELECT *
FROM Rec_details
ORDER BY Quantity DESC, Item_no DESC
)
WHERE ROWNUM = 1
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 3 - Get all rows with maximum quantity
(using 1 table scan):
SELECT Item_no, Quantity
FROM (
SELECT r.*,
RANK() OVER ( ORDER BY Quantity DESC ) AS rnk
FROM Rec_details r
)
WHERE rnk = 1
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 4 - Get all rows with maximum quantity
(using 2 table scans):
SELECT Item_no,
Quantity
FROM Rec_Details
WHERE Quantity = ( SELECT MAX( Quantity ) FROM Rec_Details )
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 5 - Get one row with maximum Quantity
and latest Item_No
using Oracle 12 Syntax (1 table scan):
SELECT *
FROM Rec_Details
ORDER BY Quantity DESC, Item_No DESC
FETCH FIRST ROW ONLY;
Query 5 - Get all rows with maximum Quantity
using Oracle 12 Syntax (1 table scan):
SELECT *
FROM Rec_Details
ORDER BY Quantity DESC
FETCH FIRST ROW WITH TIES;
How to select pandas row with maximum value in one column, from a group of rows that share two common columns?
If not last 2 columns has same values use numpy.isclose
for select columns by some precision, also for performance is better select by DataFrame.loc
by mask and column name:
df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].max()
For index of maximum value use Series.idxmax
:
df.loc[np.isclose(df['col4'], 0.999141) & np.isclose(df['col5'], 0.000861559), 'col3'].idxmax()
For select by maximum col4
and minimum col5
use:
df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].max()
df.loc[df['col4'].eq(df['col4'].max()) & df['col5'].eq(df['col5'].min()), 'col3'].idxmax()
SELECT ONE Row with the MAX() value on a column
select top 1 * from newsletters where IsActive = 1 order by PublishDate desc
MySQL - How to select rows with max value of a field
If you want to get ties, then you can do something like this:
select s.*
from scores s
where s.score = (select max(s2.score) from scores s2 where s2.level = s.level);
You could get one row per level by aggregating this:
select s.level, s.score, group_concat(s.user_id)
from scores s
where s.score = (select max(s2.score) from scores s2 where s2.level = s.level)
group by s.level, s.score;
This combines the users (if there is more than one) into a single field.
Related Topics
How to Calculate the Last Day of the Month in SQL
SQL Server: Delete All the Rows of All the Tables
Default a Column with Empty String
Doing a Where in on Multiple Columns in Postgresql
SQL Update Query Syntax with Inner Join
Adding a Uniqueidentifier Column and Adding the Default to Generate New Guid
Creating New User/Login in SQL Azure
Using Alias in Query and Using It
Select One Row with the Max() Value on a Column
Assigning a Variable Inside an If Exists Clause
How to Store Longitude & Latitude as a Geography in SQL Server 2014
Oracle Pivot Query Gives Columns with Quotes Around the Column Names. What
Nolock VS. Transaction Isolation Level
Using Multiple Joins. Sum() Producing Wrong Value
Count Consecutive Duplicate Values in SQL
What Is the Fastest Way to Truncate Timestamps to 5 Minutes in Postgres
Fastest Way to Export Blobs from Table into Individual Files
Windowed Functions Can Only Appear in the Select or Order by Clauses