Get top row(s) with highest value, with ties
The first query fails if any row has quantity IS NULL
(as Gordon demonstrates).
The second query only fails if all rows have quantity IS NULL
. So it should be usable in most cases. (And it's faster.)
Postgres 13 or newer
Use the standard SQL clause WITH TIES
:
SELECT id
FROM product
ORDER BY quantity DESC NULLS LAST
FETCH FIRST 1 ROWS WITH TIES;
db<>fiddle here
Works with any amount of NULL
values.
The manual:
SQL:2008 introduced a different syntax to achieve the same result,
which PostgreSQL also supports. It is:OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } { ONLY | WITH TIES }In this syntax, the
start
orcount
value is required by the standard
to be a literal constant, a parameter, or a variable name; as a
PostgreSQL extension, other expressions are allowed, but will
generally need to be enclosed in parentheses to avoid ambiguity. Ifcount
is omitted in aFETCH
clause, it defaults to 1. TheWITH TIES
option is used to return any additional rows that tie for the last
place in the result set according to theORDER BY
clause;ORDER BY
is
mandatory in this case.ROW
andROWS
as well asFIRST
andNEXT
are
noise words that don't influence the effects of these clauses.
Notably, WITH TIES
cannot be used with the (non-standard) short syntax LIMIT n
.
It's the fastest possible solution. Faster than either of your current queries. More important for performance: have an index on (quantity)
. Or a more specialized covering index to allow index-only scans (a bit faster, yet):
CREATE INDEX ON product (quantity DESC NULLS LAST) INCLUDE (id);
See:
- Do covering indexes in PostgreSQL help JOIN columns?
We need NULLS LAST
to keep NULL
values last in descending order. See:
- Sort by column ASC, but NULL values first?
Postgres 12 or older
A NULL-safe query:
SELECT id, quantity
FROM product
WHERE quantity IS NOT DISTINCT FROM (SELECT MAX(quantity) FROM product);
Or, probably faster:
SELECT id, quantity
FROM (
SELECT *, rank() OVER (ORDER BY quantity DESC NULLS LAST) AS rnk
FROM product
) sub
WHERE rnk = 1;
See:
- PostgreSQL equivalent for TOP n WITH TIES: LIMIT "with ties"?
Faster alternatives for big tables:
- Equivalent for FETCH FIRST WITH TIES in PostgreSQL 11 with comparable performance
Python pandas: Retrieve highest value of each rows with TIES
Calculate the max values per Row Number
using groupby.transform('max')
and then filter:
df[df.Rank == df.Rank.groupby(df['Row Number']).transform('max')]
Row Number Rank
0 702 20
1 702 20
3 100 5
4 100 5
SQL query with Highest value with tie
Use RANK()
window function:
SELECT t.dept_name, t.name, t.tot_cred
FROM (
SELECT dept_name, name, tot_cred,
RANK() OVER(PARTITION BY dept_name ORDER BY tot_cred DESC) rn
FROM university.student
) t
WHERE t.rn = 1
This is an alternative in case you can't use window functions:
SELECT s.dept_name, s.name, s.tot_cred
FROM university.student s
INNER JOIN (
SELECT dept_name, MAX(tot_cred) tot_cred
FROM university.student
GROUP BY dept_name
) t ON t.dept_name = s.dept_name AND s.tot_cred = t.tot_cred
or with NOT EXISTS
:
SELECT s.dept_name, s.name, s.tot_cred
FROM university.student s
WHERE NOT EXISTS (
SELECT 1 FROM university.student
WHERE dept_name = s.dept_name AND tot_cred > s.tot_cred
)
Get records with max value for each group of grouped SQL results
There's a super-simple way to do this in mysql:
select *
from (select * from mytable order by `Group`, age desc, Person) x
group by `Group`
This works because in mysql you're allowed to not aggregate non-group-by columns, in which case mysql just returns the first row. The solution is to first order the data such that for each group the row you want is first, then group by the columns you want the value for.
You avoid complicated subqueries that try to find the max()
etc, and also the problems of returning multiple rows when there are more than one with the same maximum value (as the other answers would do)
Note: This is a mysql-only solution. All other databases I know will throw an SQL syntax error with the message "non aggregated columns are not listed in the group by clause" or similar. Because this solution uses undocumented behavior, the more cautious may want to include a test to assert that it remains working should a future version of MySQL change this behavior.
Version 5.7 update:
Since version 5.7, the sql-mode
setting includes ONLY_FULL_GROUP_BY
by default, so to make this work you must not have this option (edit the option file for the server to remove this setting).
SQL select only rows with max value on a column
At first glance...
All you need is a GROUP BY
clause with the MAX
aggregate function:
SELECT id, MAX(rev)
FROM YourTable
GROUP BY id
It's never that simple, is it?
I just noticed you need the content
column as well.
This is a very common question in SQL: find the whole data for the row with some max value in a column per some group identifier. I heard that a lot during my career. Actually, it was one the questions I answered in my current job's technical interview.
It is, actually, so common that Stack Overflow community has created a single tag just to deal with questions like that: greatest-n-per-group.
Basically, you have two approaches to solve that problem:
Joining with simple group-identifier, max-value-in-group
Sub-query
In this approach, you first find the group-identifier, max-value-in-group
(already solved above) in a sub-query. Then you join your table to the sub-query with equality on both group-identifier
and max-value-in-group
:
SELECT a.id, a.rev, a.contents
FROM YourTable a
INNER JOIN (
SELECT id, MAX(rev) rev
FROM YourTable
GROUP BY id
) b ON a.id = b.id AND a.rev = b.rev
Left Joining with self, tweaking join conditions and filters
In this approach, you left join the table with itself. Equality goes in the group-identifier
. Then, 2 smart moves:
- The second join condition is having left side value less than right value
- When you do step 1, the row(s) that actually have the max value will have
NULL
in the right side (it's aLEFT JOIN
, remember?). Then, we filter the joined result, showing only the rows where the right side isNULL
.
So you end up with:
SELECT a.*
FROM YourTable a
LEFT OUTER JOIN YourTable b
ON a.id = b.id AND a.rev < b.rev
WHERE b.id IS NULL;
Conclusion
Both approaches bring the exact same result.
If you have two rows with max-value-in-group
for group-identifier
, both rows will be in the result in both approaches.
Both approaches are SQL ANSI compatible, thus, will work with your favorite RDBMS, regardless of its "flavor".
Both approaches are also performance friendly, however your mileage may vary (RDBMS, DB Structure, Indexes, etc.). So when you pick one approach over the other, benchmark. And make sure you pick the one which make most of sense to you.
How can I SELECT the first row with MAX(Column value)?
Why does your second query not work...
select Item_No,
Quantity
from Rec_details
group by Item_No,
Quantity
having Quantity=max(Quantity);
You are grouping by both Item_No
and Quantity
and the Item_No
appears to be the primary key and contain unique values so each group will only contain one row. The HAVING
clause looks within the group so it will check that the value of quantity
is the maximum value within that group but there is only one value within the group so this will always be true. Your query is the equivalent of:
SELECT DISTINCT
Item_No,
Quantity
FROM Rec_details;
Some other ways to get the maximum value:
SQL Fiddle
Oracle 11g R2 Schema Setup:
create table Rec_details (item_no, Quantity ) AS
SELECT 12507,1 FROM DUAL UNION ALL
SELECT 12549,4 FROM DUAL UNION ALL
SELECT 12100,8 FROM DUAL UNION ALL
SELECT 12501,2 FROM DUAL UNION ALL
SELECT 12201,7 FROM DUAL UNION ALL
SELECT 12509,3 FROM DUAL UNION ALL
SELECT 12080,1 FROM DUAL;
Query 1 - Get one row with maximum quantity
and latest item_no
(using 1 table scan):
SELECT MAX( item_no ) KEEP ( DENSE_RANK LAST ORDER BY Quantity ) AS Item_no,
MAX( Quantity ) AS Quantity
FROM Rec_Details
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 2 - Get one row with maximum quantity
and latest item_no
(using 1 table scan):
SELECT *
FROM (
SELECT *
FROM Rec_details
ORDER BY Quantity DESC, Item_no DESC
)
WHERE ROWNUM = 1
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 3 - Get all rows with maximum quantity
(using 1 table scan):
SELECT Item_no, Quantity
FROM (
SELECT r.*,
RANK() OVER ( ORDER BY Quantity DESC ) AS rnk
FROM Rec_details r
)
WHERE rnk = 1
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 4 - Get all rows with maximum quantity
(using 2 table scans):
SELECT Item_no,
Quantity
FROM Rec_Details
WHERE Quantity = ( SELECT MAX( Quantity ) FROM Rec_Details )
Results:
| ITEM_NO | QUANTITY |
|---------|----------|
| 12100 | 8 |
Query 5 - Get one row with maximum Quantity
and latest Item_No
using Oracle 12 Syntax (1 table scan):
SELECT *
FROM Rec_Details
ORDER BY Quantity DESC, Item_No DESC
FETCH FIRST ROW ONLY;
Query 5 - Get all rows with maximum Quantity
using Oracle 12 Syntax (1 table scan):
SELECT *
FROM Rec_Details
ORDER BY Quantity DESC
FETCH FIRST ROW WITH TIES;
R - find out if a the column containing the max value of a row is tied with another column
An option would be to get the max
and check the count of max
elements by row
df$tie <- apply(df[1:5], 1, function(x) sum(x == max(x)) > 1)
df$tie
#[1] FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
Or using rowSums
rowSums(df[1:5] == do.call(pmax, df[1:5])) > 1
#[1] FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
NOTE: Both the methods are generalized and can take care of multiple conditions
How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?
You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten
table on BOTH fields:
SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
Select the row with the maximum value in each group
Here's a data.table
solution:
require(data.table) ## 1.9.2
group <- as.data.table(group)
If you want to keep all the entries corresponding to max values of pt
within each group:
group[group[, .I[pt == max(pt)], by=Subject]$V1]
# Subject pt Event
# 1: 1 5 2
# 2: 2 17 2
# 3: 3 5 2
If you'd like just the first max value of pt
:
group[group[, .I[which.max(pt)], by=Subject]$V1]
# Subject pt Event
# 1: 1 5 2
# 2: 2 17 2
# 3: 3 5 2
In this case, it doesn't make a difference, as there aren't multiple maximum values within any group in your data.
Related Topics
MySQL Equivalent of Decode Function in Oracle
How to Get Oracle Create Table Statement in SQL*Plus
Get the Default Values of Table Columns in Postgres
Rbar VS. Set Based Programming for SQL
"In" Clause Limitation in SQL Server
Are There Any Way to Execute a Query Inside the String Value (Like Eval) in Postgresql
Can a SQL Server Table Have Two Identity Columns
SQL Statement to Get Column Type
SQL "Select Where Not in Subquery" Returns No Results
Fastest Check If Row Exists in Postgresql
What Does the Colon Sign ":" Do in a SQL Query
Ssis Package Not Wanting to Fetch Metadata of Temporary Table
Check for File Exists or Not in SQL Server
How to Deal with Concurrent Updates in Databases
Insert Update Stored Proc on SQL Server
Postgres Window Function and Group by Exception