SQL Server: Only Last Entry in Group By

SQL Server: Only last entry in GROUP BY

An alternative solution, which may give you better performance (test both ways and check the execution plans):

SELECT
T1.id,
T1.business_key,
T1.result
FROM
dbo.My_Table T1
LEFT OUTER JOIN dbo.My_Table T2 ON
T2.business_key = T1.business_key AND
T2.id > T1.id
WHERE
T2.id IS NULL

This query assumes that the ID is a unique value (at least for any given business_key) and that it is set to NOT NULL.

Retrieving last record in each group from database - SQL Server 2005/2008

;with cteRowNumber as (
select COMPUTERNAME, SERIALNUMBER, USERNAME, LASTIP, LASTUPDATE, SOURCE,
row_number() over(partition by COMPUTERNAME order by LASTUPDATE desc) as RowNum
from YourTable
)
select COMPUTERNAME, SERIALNUMBER, USERNAME, LASTIP, LASTUPDATE, SOURCE
from cteRowNumber
where RowNum = 1

How to get the first and the last record per group in SQL Server 2008?

How about using ROW_NUMBER:

SQL Fiddle

WITH Cte AS(
SELECT *,
RnAsc = ROW_NUMBER() OVER(PARTITION BY [group] ORDER BY val),
RnDesc = ROW_NUMBER() OVER(PARTITION BY [group] ORDER BY val DESC)
FROM tbl
)
SELECT
id, [group], val, start, [end]
FROM Cte
WHERE
RnAsc = 1 OR RnDesc = 1
ORDER BY [group], val

MSSQL: Only last entry in GROUP BY (with id)

SELECT t.*
FROM
(
SELECT *, ROW_NUMBER() OVER
(
PARTITION BY [business_key]
ORDER BY [date] DESC
) AS [RowNum]
FROM yourTable
) AS t
WHERE t.[RowNum] = 1

Find last row in group by query-SQL Server

select ID, Name 
FROM (select ID, Name, -- add other columns here
ROW_NUMBER() over (partition by Name ORDER BY ID DESC) as MAX_ID
from Customer) x
WHERE MAX_ID = 1

How to get the last record per group in SQL

You can use a ranking function and a common table expression.

WITH e AS
(
SELECT *,
ROW_NUMBER() OVER
(
PARTITION BY ApplicationId
ORDER BY CONVERT(datetime, [Date], 101) DESC, [Time] DESC
) AS Recency
FROM [Event]
)
SELECT *
FROM e
WHERE Recency = 1

Get the latest records per Group By SQL

The rank window clause allows you to, well, rank rows according to some partitioning, and then you could just select the top ones:

SELECT oDate, oName, oItem, oQty, oRemarks
FROM (SELECT oDate, oName, oItem, oQty, oRemarks,
RANK() OVER (PARTITION BY oName ORDER BY oDate DESC) AS rk
FROM my_table) t
WHERE rk = 1

Retrieving the last record in each group - MySQL

MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:

WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;

This and other approaches to finding groupwise maximal rows are illustrated in the MySQL manual.

Below is the original answer I wrote for this question in 2009:


I write the solution this way:

SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;

Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that is better at performance given your database.

For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.

I'll write a query to find the most recent post for a given user ID (mine).

First using the technique shown by @Eric with the GROUP BY in a subquery:

SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;

1 row in set (1 min 17.89 sec)

Even the EXPLAIN analysis takes over 16 seconds:

+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 76756 | |
| 1 | PRIMARY | p1 | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY | 8 | p2.maxpostid | 1 | Using where |
| 2 | DERIVED | pi | index | NULL | OwnerUserId | 8 | NULL | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)

Now produce the same query result using my technique with LEFT JOIN:

SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;

1 row in set (0.28 sec)

The EXPLAIN analysis shows that both tables are able to use their indexes:

+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| 1 | SIMPLE | p1 | ref | OwnerUserId | OwnerUserId | 8 | const | 1384 | Using index |
| 1 | SIMPLE | p2 | ref | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8 | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)

Here's the DDL for my Posts table:

CREATE TABLE `posts` (
`PostId` bigint(20) unsigned NOT NULL auto_increment,
`PostTypeId` bigint(20) unsigned NOT NULL,
`AcceptedAnswerId` bigint(20) unsigned default NULL,
`ParentId` bigint(20) unsigned default NULL,
`CreationDate` datetime NOT NULL,
`Score` int(11) NOT NULL default '0',
`ViewCount` int(11) NOT NULL default '0',
`Body` text NOT NULL,
`OwnerUserId` bigint(20) unsigned NOT NULL,
`OwnerDisplayName` varchar(40) default NULL,
`LastEditorUserId` bigint(20) unsigned default NULL,
`LastEditDate` datetime default NULL,
`LastActivityDate` datetime default NULL,
`Title` varchar(250) NOT NULL default '',
`Tags` varchar(150) NOT NULL default '',
`AnswerCount` int(11) NOT NULL default '0',
`CommentCount` int(11) NOT NULL default '0',
`FavoriteCount` int(11) NOT NULL default '0',
`ClosedDate` datetime default NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;

Note to commenters: If you want another benchmark with a different version of MySQL, a different dataset, or different table design, feel free to do it yourself. I have shown the technique above. Stack Overflow is here to show you how to do software development work, not to do all the work for you.

Select last records from table using group by

Assuming that the start and end dates will always be the highest values then you need to drop some of the columns from the GROUP BY (having all the columns in the GROUP BY is kinda like using DISTINCT) and use an aggregate function on the other column:

SELECT UserId,
MAX(StartDate) AS StartDate,
MAX(EndDate) AS EndDate
FROM usersworktime
GROUP BY UserId;

Otherwise, if that isn't the case, you can use a CTE and ROW_NUMBER:

WITH CTE AS(
SELECT UserID,
StartDate,
EndDate,
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY UsersWordTimeID DESC) AS RN
FROM usersworktime)
SELECT UserID,
StartDate,
EndDate
FROM CTE
WHERE RN = 1;


Related Topics



Leave a reply



Submit