SQL: How To Select Earliest Row
Simply use min()
SELECT company, workflow, MIN(date)
FROM workflowTable
GROUP BY company, workflow
Select row with the earliest date
You can use window functions for this, either ROW_NUMBER()
or MIN()
. The idea is to partition the rows by the ID - OVER (PARTITION BY id)
- and then either assign row numbers (ordered by the datetime) or find the minimum datetime per ID.
Solution with ROW_NUMBER()
:
; WITH cte AS
( SELECT id, datetime, string,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY datetime)
FROM tableX
)
SELECT id, datetime, string
FROM cte
WHERE rn = 1 ;
and with MIN()
:
; WITH cte AS
( SELECT id, datetime, string,
min_datetime = MIN(datetime) OVER (PARTITION BY id)
FROM tableX
)
SELECT id, datetime, string
FROM cte
WHERE datetime = min_datetime ;
The second version has slightly different behaviour. If there are two (or more) rows with exactly same datetime for an ID, they will be both in the result.
Select earliest date and count rows in table with duplicate IDs
You would use aggregation:
select id, min(create_date), count(*)
from table1
group by id;
I find it amusing that you want to use window functions -- which are considered more advanced -- when lowly aggregation suffices.
Get top 1 row of each group
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
If you expect 2 entries per day, then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead
As for normalised or not, it depends if you want to:
- maintain status in 2 places
- preserve status history
- ...
As it stands, you preserve status history. If you want latest status in the parent table too (which is denormalisation) you'd need a trigger to maintain "status" in the parent. or drop this status history table.
Selecting the first N rows of each group ordered by date
As well as the row_number
solution, another option is CROSS APPLY(SELECT TOP
:
SELECT m.masterid,
d.detailid,
m.numbers,
d.date_time,
d.value
FROM masters AS m
CROSS APPLY (
SELECT TOP (3) *
FROM details AS d
WHERE d.date_time >= '2020-01-01'
AND m.masterid = d.masterid
) AS d
WHERE m.tags LIKE '%Tag2%'
ORDER BY m.masterid DESC,
d.date_time;
This may be faster or slower than row_number
, mostly depending on cardinalities (quantity of rows) and indexing.
If indexing is good and it's a small number of rows it will usually be faster. If the inner table needs sorting or you are anyway selecting most rows then use row_number
.
Related Topics
Modify Table: How to Change 'Allow Nulls' Attribute from Not Null to Allow Null
Multiple Full Outer Join on Multiple Tables
Prevent Recursive Cte Visiting Nodes Multiple Times
Fast Relational Method of Storing Tree Data (For Instance Threaded Comments on Articles)
Concatenate/Merge Array Values During Grouping/Aggregation
What Is the Equivalent of 'Go' in MySQL
Search for a Particular String in Oracle Clob Column
Is 20 SQL Queries Per Page Load Really Considered a Lot
Take VS First Performance in Ruby on Rails
Sql-Server: Error - Exclusive Access Could Not Be Obtained Because the Database Is in Use
Declare Variable for a Query String
SQL Server 2008: Delete Duplicate Rows
How to Schedule a SQL Job in Microsoft Azure SQL Database
How to Update All Columns with Insert ... on Conflict ...