Get top 1 row of each group
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
If you expect two entries per day, this will arbitrarily pick one of them. To get both entries for a day, use DENSE_RANK instead of ROW_NUMBER.
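To see the difference between the two ranking functions on tied dates, here is a minimal sketch that runs the same CTE against an in-memory SQLite database from Python. The sample rows and the Status column are made up for illustration, and it assumes the bundled SQLite is 3.25 or later (the first version with window functions); SQL Server itself is not involved.

```python
import sqlite3

# Hypothetical sample data: document 1 has two log entries on its latest day.
rows = [
    (1, "draft", "2024-01-01"),
    (1, "review", "2024-01-02"),
    (1, "approved", "2024-01-02"),
    (2, "draft", "2024-01-05"),
]

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute(
    "CREATE TABLE DocumentStatusLogs "
    "(DocumentID INTEGER, Status TEXT, DateCreated TEXT)"
)
conn.executemany("INSERT INTO DocumentStatusLogs VALUES (?, ?, ?)", rows)


def latest_per_document(ranking_function):
    """Return (DocumentID, Status) rows ranked first within each document."""
    sql = f"""
        WITH cte AS (
            SELECT *,
                   {ranking_function}() OVER (
                       PARTITION BY DocumentID ORDER BY DateCreated DESC
                   ) AS rn
            FROM DocumentStatusLogs
        )
        SELECT DocumentID, Status
        FROM cte
        WHERE rn = 1
        ORDER BY DocumentID, Status
    """
    return conn.execute(sql).fetchall()


row_number_result = latest_per_document("ROW_NUMBER")  # one row per document
dense_rank_result = latest_per_document("DENSE_RANK")  # tied rows are all kept
print(row_number_result)
print(dense_rank_result)
```

With ROW_NUMBER, which of the two tied rows for document 1 comes back is not defined; with DENSE_RANK both come back.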
As for normalised or not, it depends if you want to:
- maintain status in 2 places
- preserve status history
- ...
As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or drop this status history table.
Selecting the top n rows within a group by clause
CROSS APPLY is how you usually do this - http://msdn.microsoft.com/en-us/library/ms175156.aspx
For example, something like this:
select
bar1.instrument
,bar2.*
from (
select distinct instrument from bar) as bar1
cross apply (
select top 5
bar2.instrument
,bar2.bar_dttm
,bar2.bar_open
,bar2.bar_close
from bar as bar2 where bar2.instrument = bar1.instrument) as bar2
Typically you would want to add an ORDER BY inside the CROSS APPLY subquery, so that TOP 5 returns a well-defined set of rows for each instrument. The DISTINCT in the outer query makes sure each instrument is processed only once.
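If it helps to picture what the CROSS APPLY query above does, here is a rough Python sketch of the same logic: take the distinct instruments, then run the inner "top n" query once per instrument. The sample rows are invented, and ordering by bar_dttm descending is an assumption, since the query as posted has no ORDER BY.

```python
from operator import itemgetter

# Hypothetical rows of the bar table: (instrument, bar_dttm, bar_open, bar_close)
bar = [
    ("AAA", "2024-01-01", 10.0, 10.5),
    ("AAA", "2024-01-03", 11.0, 11.2),
    ("BBB", "2024-01-02", 20.0, 19.5),
    ("AAA", "2024-01-02", 10.5, 11.0),
]


def top_n_per_instrument(rows, n):
    # Outer query: select distinct instrument from bar
    instruments = sorted({row[0] for row in rows})
    result = []
    for instrument in instruments:
        # Applied subquery: rows for this instrument only
        matching = [row for row in rows if row[0] == instrument]
        # Assumed ordering: newest bar_dttm first
        matching.sort(key=itemgetter(1), reverse=True)
        result.extend(matching[:n])  # select top n
    return result


top_two = top_n_per_instrument(bar, 2)
for row in top_two:
    print(row)
```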
Get top n records for each group of grouped results
Here is one way to do this, using UNION ALL. This works with two groups; if you have more than two groups, you would need to add a query for each group number:
(
select *
from mytable
where `group` = 1
order by age desc
LIMIT 2
)
UNION ALL
(
select *
from mytable
where `group` = 2
order by age desc
LIMIT 2
)
There are a variety of ways to do this; see this article to determine the best route for your situation:
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
This might work for you too. It generates a row number for each record within its group. Using an example from the link above, this returns only those records with a row number less than or equal to 2:
select person, `group`, age
from
(
select person, `group`, age,
(@num:=if(@group = `group`, @num +1, if(@group := `group`, 1, 1))) row_number
from test t
CROSS JOIN (select @num:=0, @group:=null) c
order by `Group`, Age desc, person
) as x
where x.row_number <= 2;
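The user-variable trick can be hard to read, so here is a small Python sketch of what the @num and @group variables are doing as MySQL walks the ordered rows. The sample people and ages are made up, and the function name is hypothetical.

```python
# Hypothetical rows of the test table: (person, group, age)
test = [
    ("Bob", 1, 32),
    ("Jill", 1, 34),
    ("Shawn", 1, 42),
    ("Jake", 2, 29),
    ("Paul", 2, 36),
    ("Laura", 2, 39),
]


def top_n_with_running_counter(rows, n):
    # Same ordering as the query: group ascending, age descending, then person
    ordered = sorted(rows, key=lambda r: (r[1], -r[2], r[0]))
    current_group = None  # plays the role of @group
    num = 0               # plays the role of @num
    kept = []
    for person, group, age in ordered:
        if group == current_group:
            num += 1               # same group: keep counting
        else:
            current_group = group  # new group: restart the counter at 1
            num = 1
        if num <= n:               # where x.row_number <= n
            kept.append((person, group, age))
    return kept


top_two = top_n_with_running_counter(test, 2)
for row in top_two:
    print(row)
```

The counter only means something because the rows are visited in group order, which is why the ORDER BY in the inner query matters.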
How do I select top N rows grouped by an ID in big query?
Use the query below:
select *
from your_table
where in_stock
qualify 3 >= row_number() over(partition by aisle order by price desc)
Selecting the first N rows of each group ordered by date
As well as the row_number solution, another option is CROSS APPLY with SELECT TOP:
SELECT m.masterid,
d.detailid,
m.numbers,
d.date_time,
d.value
FROM masters AS m
CROSS APPLY (
SELECT TOP (3) *
FROM details AS d
WHERE d.date_time >= '2020-01-01'
AND m.masterid = d.masterid
ORDER BY d.date_time -- assumed ordering; without ORDER BY, TOP (3) returns arbitrary rows
) AS d
WHERE m.tags LIKE '%Tag2%'
ORDER BY m.masterid DESC,
d.date_time;
This may be faster or slower than row_number, mostly depending on cardinalities (quantity of rows) and indexing.
If indexing is good and it's a small number of rows, it will usually be faster. If the inner table needs sorting, or you are selecting most rows anyway, then use row_number.
How to select top N rows for each group in a Entity Framework GroupBy with EF 3.1
Update (EF Core 6.0):
EF Core 6.0 added support for translating GroupBy result set projection, so the original code for taking (key, items) now works as it should, i.e.
var query = context.Set<DbDocument>()
.Where(e => partnerIds.Contains(e.SenderId))
.GroupBy(e => e.SenderId)
.Select(g => new
{
g.Key,
Documents = g.OrderByDescending(e => e.InsertedDateTime).Take(10)
});
However, flattening (via SelectMany) is still unsupported, so you have to use the workaround below if you need that query shape.
Original (EF Core 3.0/3.1/5.0):
This is a common problem, unfortunately not supported by the EF Core 3.0/3.1/5.0 query translator, specifically for GroupBy.
The workaround is to do the grouping manually by correlating 2 subqueries - one for keys and one for corresponding data.
Applying it to your examples would be something like this.
If you need (key, items) pairs:
var query = context.Set<DbDocument>()
.Where(t => partnerIds.Contains(t.SenderId))
.Select(t => t.SenderId).Distinct() // <--
.Select(key => new
{
Key = key,
Documents =
context.Set<DbDocument>().Where(t => t.SenderId == key) // <--
.OrderByDescending(t => t.InsertedDateTime).Take(10)
.ToList() // <--
});
If you need just a flat result set containing the top N items per key:
var query = context.Set<DbDocument>()
.Where(t => partnerIds.Contains(t.SenderId))
.Select(t => t.SenderId).Distinct() // <--
.SelectMany(key => context.Set<DbDocument>().Where(t => t.SenderId == key) // <--
.OrderByDescending(t => t.InsertedDateTime).Take(10)
);
Select first `n` rows of a grouped query
The row_number function is exactly what you're looking for:
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY entry_time DESC) AS rn
FROM user_gps_location
WHERE entry_time > '2020-09-01') t
WHERE rn <= 5
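As a quick way to try the query without a full database server, here is a sketch that runs it against an in-memory SQLite database from Python. The sample rows are invented, n is passed as a parameter instead of the fixed 5, and it assumes SQLite 3.25 or later for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute("CREATE TABLE user_gps_location (user_id INTEGER, entry_time TEXT)")
conn.executemany(
    "INSERT INTO user_gps_location VALUES (?, ?)",
    [
        (1, "2020-08-30"),  # dropped by the entry_time filter
        (1, "2020-09-02"),
        (1, "2020-09-03"),
        (1, "2020-09-04"),
        (2, "2020-09-05"),
    ],
)


def latest_entries(n):
    """Return up to n most recent entries per user after 2020-09-01."""
    sql = """
        SELECT user_id, entry_time
        FROM (SELECT *,
                     ROW_NUMBER() OVER (
                         PARTITION BY user_id ORDER BY entry_time DESC
                     ) AS rn
              FROM user_gps_location
              WHERE entry_time > '2020-09-01') t
        WHERE rn <= ?
        ORDER BY user_id, entry_time DESC
    """
    return conn.execute(sql, (n,)).fetchall()


latest_two = latest_entries(2)
print(latest_two)
```

Note that the date filter sits inside the subquery, so rows are numbered only after older entries have been excluded.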
How to select top n row from each group after group by in pandas?
I'd recommend sorting your counts in descending order first, and then calling GroupBy.head:
(freq_df.sort_values('count', ascending=False)
.groupby(['open_year','open_month'], sort=False).head(5)
)
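Here is a small self-contained run of the same chain, in case you want to check the behaviour. The data and the category column are made up; only the column names open_year, open_month and count come from the snippet above, and head(2) is used instead of head(5) to keep the frame short.

```python
import pandas as pd

# Hypothetical counts; two (open_year, open_month) groups.
freq_df = pd.DataFrame({
    "open_year": [2019, 2019, 2019, 2020, 2020],
    "open_month": [1, 1, 1, 2, 2],
    "category": ["a", "b", "c", "a", "b"],
    "count": [5, 9, 7, 3, 8],
})

top_two = (
    freq_df.sort_values("count", ascending=False)           # largest counts first
           .groupby(["open_year", "open_month"], sort=False)
           .head(2)                                          # first 2 rows of each group
)
print(top_two)
```

The result keeps the globally sorted row order, so rows from different groups can be interleaved; sort again by the group keys if you want them together.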
SELECT TOP 20 rows for each group
The easiest way would be to use the row_number() window function to number the rows for each city by VisitNumber descending and use that as a filter. This query should work in any SQL Server version from 2005 onwards.
select *
from (
select *, r = row_number() over (partition by City order by VisitNumber desc)
from your_table
) a
where r <= 20
and City in ('Washington', 'New York', 'Los Angeles')
This would select the top 20 items for each city specified in the where clause.