Select first row in each GROUP BY group?
On databases that support CTE and windowing functions:
WITH summary AS (
    SELECT p.id,
           p.customer,
           p.total,
           ROW_NUMBER() OVER (PARTITION BY p.customer
                              ORDER BY p.total DESC) AS rk  -- "rank" is a reserved word in some databases
    FROM PURCHASES p)
SELECT *
FROM summary
WHERE rk = 1
Supported by any database, but you need to add logic to break ties:
SELECT MIN(x.id), -- change to MAX if you want the highest
x.customer,
x.total
FROM PURCHASES x
JOIN (SELECT p.customer,
MAX(total) AS max_total
FROM PURCHASES p
GROUP BY p.customer) y ON y.customer = x.customer
AND y.max_total = x.total
GROUP BY x.customer, x.total
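To compare the two queries on tied totals, here is a minimal sketch that runs both through Python's built-in sqlite3 module. It assumes an SQLite build of 3.25 or later (the first with window functions). The sample rows are invented for illustration; only the PURCHASES column names come from the queries above. An extra id tie-breaker is added to the window ORDER BY so that both versions are deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER);
    -- Hypothetical sample data: Joe has two purchases tied at the top total.
    INSERT INTO purchases VALUES
        (1, 'Joe', 5), (2, 'Sally', 3), (3, 'Joe', 5), (4, 'Sally', 1);
""")

# Window-function version. The alias rk avoids clashing with the RANK
# function name, and p.id is added as a deterministic tie-breaker.
window_rows = conn.execute("""
    WITH summary AS (
        SELECT p.id, p.customer, p.total,
               ROW_NUMBER() OVER (PARTITION BY p.customer
                                  ORDER BY p.total DESC, p.id) AS rk
        FROM purchases p)
    SELECT id, customer, total FROM summary WHERE rk = 1 ORDER BY customer
""").fetchall()

# Portable version: join to the per-customer maximum; MIN(id) breaks ties.
join_rows = conn.execute("""
    SELECT MIN(x.id), x.customer, x.total
    FROM purchases x
    JOIN (SELECT p.customer, MAX(p.total) AS max_total
          FROM purchases p
          GROUP BY p.customer) y ON y.customer = x.customer
                                AND y.max_total = x.total
    GROUP BY x.customer, x.total
    ORDER BY x.customer
""").fetchall()

print(window_rows)
print(join_rows)
```

With this data both queries return one row per customer, and Joe's tie is resolved in favour of the lower id in each case.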
Get top 1 row of each group
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
If you expect 2 entries per day (two rows with the same DateCreated), then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead of ROW_NUMBER.
As for normalised or not, it depends if you want to:
- maintain status in 2 places
- preserve status history
- ...
As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or you could drop this status history table.
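The difference between ROW_NUMBER and DENSE_RANK on tied dates can be checked with a small sketch. It uses Python's sqlite3 module rather than SQL Server (an assumption made so the example is self-contained; SQLite 3.25+ is required for window functions). The ID and Status columns and all sample rows are hypothetical; only DocumentID and DateCreated appear in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DocumentStatusLogs (
        ID INTEGER PRIMARY KEY, DocumentID INTEGER, Status TEXT, DateCreated TEXT);
    -- Hypothetical data: document 1 has two entries on its latest day.
    INSERT INTO DocumentStatusLogs VALUES
        (1, 1, 'S1', '2011-07-29'),
        (2, 1, 'S2', '2011-07-30'),
        (3, 1, 'S3', '2011-07-30'),
        (4, 2, 'S1', '2011-08-01');
""")

def latest(rank_fn):
    # rank_fn is only ever one of the two fixed names passed below,
    # so interpolating it into the SQL text is safe here.
    sql = f"""
        WITH cte AS (
            SELECT *,
                   {rank_fn}() OVER (PARTITION BY DocumentID
                                     ORDER BY DateCreated DESC) AS rn
            FROM DocumentStatusLogs)
        SELECT ID, DocumentID, Status FROM cte WHERE rn = 1 ORDER BY ID
    """
    return conn.execute(sql).fetchall()

row_number_rows = latest("ROW_NUMBER")   # one row per document, tie picked arbitrarily
dense_rank_rows = latest("DENSE_RANK")   # every row tied for the latest date

print(row_number_rows)
print(dense_rank_rows)
```

ROW_NUMBER yields exactly one row per document, while DENSE_RANK keeps both of document 1's entries for its latest day.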
Selecting first row per group
SELECT a, b, c
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY a ORDER BY b, c) rn
FROM mytable
) q
WHERE rn = 1
ORDER BY
a
or
SELECT mi.*
FROM (
SELECT DISTINCT a
FROM mytable
) md
CROSS APPLY
(
SELECT TOP 1 *
FROM mytable mi
WHERE mi.a = md.a
ORDER BY
b, c
) mi
ORDER BY
a
Create a composite index on (a, b, c) for the queries to work faster.
Which one is more efficient depends on your data distribution.
If you have few distinct values of a but lots of records within each a, the second query would be better.
You could improve it even more by creating an indexed view:
CREATE VIEW v_mytable_da
WITH SCHEMABINDING
AS
SELECT a, COUNT_BIG(*) cnt
FROM dbo.mytable
GROUP BY
a
GO
CREATE UNIQUE CLUSTERED INDEX
pk_vmytableda_a
ON v_mytable_da (a)
GO
SELECT mi.*
FROM v_mytable_da md
CROSS APPLY
(
SELECT TOP 1 *
FROM mytable mi
WHERE mi.a = md.a
ORDER BY
b, c
) mi
ORDER BY
a
BigQuery/SQL: Select first row of each group
I believe you are looking for the function FIRST_VALUE?
SELECT
landing_page,
FIRST_VALUE(URL)
OVER ( PARTITION BY landing_page ORDER BY Page_Type DESC) AS first_url
FROM `xxxx.TEST.draft`
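One thing to keep in mind is that FIRST_VALUE is a window function, so it returns a value for every input row rather than collapsing each group to one row. The sketch below illustrates this with Python's sqlite3 module (an assumption for runnability: SQLite 3.25+, not BigQuery, so BigQuery behaviour is not verified here). The sample rows and the integer type of Page_Type are invented. Adding DISTINCT is one way to get a single row per landing_page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE draft (landing_page TEXT, URL TEXT, Page_Type INTEGER);
    -- Hypothetical sample data.
    INSERT INTO draft VALUES
        ('home', '/a', 1), ('home', '/b', 3), ('home', '/c', 2),
        ('shop', '/x', 5), ('shop', '/y', 4);
""")

# {distinct} is filled with either "" or "DISTINCT" below.
query = """
    SELECT {distinct} landing_page,
           FIRST_VALUE(URL) OVER (PARTITION BY landing_page
                                  ORDER BY Page_Type DESC) AS first_url
    FROM draft
    ORDER BY landing_page
"""

all_rows = conn.execute(query.format(distinct="")).fetchall()
one_per_group = conn.execute(query.format(distinct="DISTINCT")).fetchall()

print(len(all_rows))    # one output row per input row
print(one_per_group)    # one output row per landing_page
```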
How do I select the first row per group in an SQL Query?
declare @sometable table ( foo int, bar int, value int )
insert into @sometable values (47, 1, 100)
insert into @sometable values (47, 0, 10)
insert into @sometable values (47, 2, 10)
insert into @sometable values (46, 0, 100)
insert into @sometable values (46, 1, 10)
insert into @sometable values (46, 2, 10)
insert into @sometable values (44, 0, 2)
;WITH cte AS
(
SELECT Foo, Bar, SUM(value) AS SumValue, ROW_NUMBER() OVER(PARTITION BY Foo ORDER BY FOO DESC, SUM(value) DESC) AS RowNumber
FROM @SomeTable
GROUP BY Foo, Bar
)
SELECT *
FROM cte
WHERE RowNumber = 1
Get first row for each group SQL
If you are running MySQL 8.0, you can use RANK() in a subquery to rank the records by subject count for each gender, and filter on the top record per group in the outer query (if there are top ties, RANK() preserves them):
SELECT gender, subject, no_of_selections
FROM (
SELECT
se.gender,
su.subject,
COUNT(*) as no_of_selections,
RANK() OVER(PARTITION BY se.gender ORDER BY COUNT(*) DESC) rn
FROM selection se
JOIN subjects su ON se.subjectID = su.subjectID
GROUP BY se.subjectID, se.gender, su.subject
) t
WHERE rn = 1
ORDER BY gender DESC
In earlier versions, where window functions are not available, one option is to filter with a HAVING clause that keeps only the top count per gender:
SELECT
se.gender,
su.subject,
COUNT(*) as no_of_selections
FROM selection se
JOIN subjects su ON se.subjectID = su.subjectID
GROUP BY se.subjectID, se.gender, su.subject
HAVING COUNT(*) = (
SELECT COUNT(*)
FROM selection se1
WHERE se1.gender = se.gender
GROUP BY se1.subjectID, se1.gender
ORDER BY COUNT(*) DESC
LIMIT 1
)
Notes:
- I changed the table aliases to make them more meaningful
- You should add the subject column to the GROUP BY clause to make your query runnable under SQL mode ONLY_FULL_GROUP_BY, which is enabled by default starting with MySQL 5.7
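To check that the RANK() and HAVING variants agree when there is a tie, here is a minimal sketch using Python's sqlite3 module instead of MySQL (an assumption for self-containment; SQLite 3.25+ is needed for RANK). The studentID column and all rows are hypothetical. The RANK() variant aggregates in a CTE first and ranks afterwards, which is a small restructuring of the query above rather than a verbatim copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subjects (subjectID INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE selection (studentID INTEGER, gender TEXT, subjectID INTEGER);
    -- Hypothetical data: for gender 'F', Math and Art tie with two selections each.
    INSERT INTO subjects VALUES (1, 'Math'), (2, 'Art'), (3, 'Music');
    INSERT INTO selection VALUES
        (1, 'F', 1), (2, 'F', 1), (3, 'F', 2), (4, 'F', 2), (5, 'F', 3),
        (6, 'M', 3), (7, 'M', 3), (8, 'M', 1);
""")

# Window-function variant: count per (gender, subject), then rank the counts.
rank_rows = conn.execute("""
    WITH counts AS (
        SELECT se.gender, su.subject, COUNT(*) AS no_of_selections
        FROM selection se
        JOIN subjects su ON se.subjectID = su.subjectID
        GROUP BY se.subjectID, se.gender, su.subject)
    SELECT gender, subject, no_of_selections
    FROM (SELECT counts.*,
                 RANK() OVER (PARTITION BY gender
                              ORDER BY no_of_selections DESC) AS rn
          FROM counts) t
    WHERE rn = 1
    ORDER BY gender, subject
""").fetchall()

# HAVING variant: keep groups whose count equals the top count for that gender.
having_rows = conn.execute("""
    SELECT se.gender, su.subject, COUNT(*) AS no_of_selections
    FROM selection se
    JOIN subjects su ON se.subjectID = su.subjectID
    GROUP BY se.subjectID, se.gender, su.subject
    HAVING COUNT(*) = (
        SELECT COUNT(*)
        FROM selection se1
        WHERE se1.gender = se.gender
        GROUP BY se1.subjectID, se1.gender
        ORDER BY COUNT(*) DESC
        LIMIT 1)
    ORDER BY se.gender, su.subject
""").fetchall()

print(rank_rows)
print(having_rows)
```

Both variants keep the tied subjects for 'F', so neither silently drops a top-ranked row.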
SQL selecting first record per group
GROUP BY u.d (without also listing u1, u2, u3) would only work if u.d was the PRIMARY KEY (which it is not, and also wouldn't make sense in your scenario). See:
- Is it possible to have an SQL query that uses AGG functions in this way?
I suggest DISTINCT ON in a subquery on UTable instead:
SELECT o.d, u.u1, u.u2, u.u3, o.n
FROM (
SELECT DISTINCT ON (u.d)
u.d, u.u1, u.u2, u.u3
FROM UTable u
WHERE u.gid = 3
AND u.gt = 'dog night'
ORDER BY u.d, u.timestamp
) u
JOIN OTable o USING (gid, gt, d);
See:
- Select first row in each GROUP BY group?
If UTable is big, at least a multicolumn index on (gid, gt) is advisable. Same for OTable.
Maybe even on (gid, gt, d). Depends on data types, cardinalities, ...
Select first row in each group in sql
You can use distinct on directly with group by:
select distinct on ("Country") Sum("Price"), "In-app Product", "Country"
from cleandatase
group by "Country", "In-app Product"
order by "Country", Sum("Price") desc;
Note: As Thorsten points out, if there are ties and you want all the ties, then distinct on is not the simplest solution.
SQL Server Group By Query Select first row each group
You can GROUP BY StudyID, Year and then, in an outer query, select the first row of each StudyID group:
SELECT StudyID, Year, minAccess1, minAccess2, minAccess3
FROM (
SELECT StudyID, Year, min(Access1) minAccess1, min(Access2) minAccess2,
min(Access3) minAccess3,
ROW_NUMBER() OVER (PARTITION BY StudyID ORDER BY Year DESC) AS rn
FROM mytable
GROUP BY StudyID, Year ) t
WHERE t.rn = 1
ROW_NUMBER is used to assign an ordering number within each StudyID group according to Year values. The row with the maximum Year value is assigned rn = 1.
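The aggregate-then-number pattern can be tried out with a minimal sketch. It runs the same query shape through Python's sqlite3 module rather than SQL Server (an assumption for runnability; SQLite 3.25+ is required). The sample rows are invented; the table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (
        StudyID INTEGER, Year INTEGER,
        Access1 INTEGER, Access2 INTEGER, Access3 INTEGER);
    -- Hypothetical data: study 1 has two rows for its latest year, 2015.
    INSERT INTO mytable VALUES
        (1, 2014, 5, 7, 9),
        (1, 2015, 4, 6, 8),
        (1, 2015, 3, 9, 8),
        (2, 2013, 1, 2, 3);
""")

rows = conn.execute("""
    SELECT StudyID, Year, minAccess1, minAccess2, minAccess3
    FROM (
        SELECT StudyID, Year,
               MIN(Access1) AS minAccess1,
               MIN(Access2) AS minAccess2,
               MIN(Access3) AS minAccess3,
               -- Numbering runs over the grouped rows, newest year first.
               ROW_NUMBER() OVER (PARTITION BY StudyID
                                  ORDER BY Year DESC) AS rn
        FROM mytable
        GROUP BY StudyID, Year) t
    WHERE t.rn = 1
    ORDER BY StudyID
""").fetchall()

print(rows)
```

Each study contributes only its latest year, with the per-column minimums taken across that year's rows.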