php mysql Group By to get latest record, not first record
If you select attributes that are not used in the GROUP BY clause and are not aggregates, the result is unspecified, i.e., you don't know which rows the other attributes are taken from. (The SQL standard does not allow such queries, but MySQL is more relaxed unless the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.5.)
The query should then be written as, e.g.:
SELECT post_id, forum_id, topic_id
FROM posts p
WHERE post_time =
(SELECT max(post_time) FROM posts p2
WHERE p2.topic_id = p.topic_id
AND p2.forum_id = p.forum_id)
GROUP BY forum_id, topic_id, post_id
ORDER BY post_time DESC
LIMIT 5;
or
SELECT post_id, forum_id, topic_id FROM posts
NATURAL JOIN
(SELECT forum_id, topic_id, max(post_time) AS post_time
FROM posts
GROUP BY forum_id, topic_id) p
ORDER BY post_time DESC
LIMIT 5;
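As a rough, self-contained way to try the correlated-subquery form, the sketch below runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the table contents and the latest_posts helper are made up for illustration.

```python
import sqlite3

def latest_posts(limit=5):
    """Return the newest post of each (forum_id, topic_id), newest first."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE posts (post_id INTEGER PRIMARY KEY, forum_id INTEGER,"
        " topic_id INTEGER, post_time TEXT)"
    )
    # Hypothetical sample rows; ISO-formatted strings sort chronologically.
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?)",
        [
            (1, 1, 10, "2020-01-01 09:00:00"),
            (2, 1, 10, "2020-01-02 09:00:00"),
            (3, 1, 11, "2020-01-01 12:00:00"),
            (4, 2, 20, "2020-01-03 08:00:00"),
            (5, 2, 20, "2020-01-01 08:00:00"),
        ],
    )
    rows = conn.execute(
        """
        SELECT post_id, forum_id, topic_id
        FROM posts p
        WHERE post_time = (SELECT MAX(post_time) FROM posts p2
                           WHERE p2.topic_id = p.topic_id
                             AND p2.forum_id = p.forum_id)
        ORDER BY post_time DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return rows

print(latest_posts())  # [(4, 2, 20), (2, 1, 10), (3, 1, 11)]
```

Posts 1 and 5 are dropped because a later post exists in the same forum and topic.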
Retrieving the last record in each group - MySQL
MySQL 8.0 now supports windowing functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:
WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;
This and other approaches to finding groupwise maximal rows are illustrated in the MySQL manual.
Below is the original answer I wrote for this question in 2009:
I write the solution this way:
SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;
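To see why the self-join works, here is a minimal sketch using Python's sqlite3 module with an in-memory database. SQLite is used only as a convenient stand-in for MySQL, and the body column, the sample rows, and the last_message_per_name helper are assumptions added for illustration. A row of m1 survives only when no row of m2 with the same name has a larger id.

```python
import sqlite3

def last_message_per_name(rows):
    """Return the highest-id message for each name via a LEFT JOIN anti-join."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, body TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT m1.id, m1.name, m1.body
        FROM messages m1 LEFT JOIN messages m2
          ON (m1.name = m2.name AND m1.id < m2.id)
        WHERE m2.id IS NULL   -- no newer row exists for this name
        ORDER BY m1.name
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data.
sample = [(1, "A", "a1"), (2, "A", "a2"), (3, "B", "b1"), (4, "A", "a3"), (5, "B", "b2")]
print(last_message_per_name(sample))  # [(4, 'A', 'a3'), (5, 'B', 'b2')]
```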
Regarding performance, one solution or the other can be better, depending on the nature of your data. So you should test both queries and use the one that performs better given your database.
For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.
I'll write a query to find the most recent post for a given user ID (mine).
First using the technique shown by @Eric with the GROUP BY in a subquery:
SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;
1 row in set (1 min 17.89 sec)
Even the EXPLAIN analysis takes over 16 seconds:
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 76756 | |
| 1 | PRIMARY | p1 | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY | 8 | p2.maxpostid | 1 | Using where |
| 2 | DERIVED | pi | index | NULL | OwnerUserId | 8 | NULL | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)
Now produce the same query result using my technique with LEFT JOIN:
SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;
1 row in set (0.28 sec)
The EXPLAIN analysis shows that both tables are able to use their indexes:
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| 1 | SIMPLE | p1 | ref | OwnerUserId | OwnerUserId | 8 | const | 1384 | Using index |
| 1 | SIMPLE | p2 | ref | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8 | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)
Here's the DDL for my Posts table:
CREATE TABLE `posts` (
`PostId` bigint(20) unsigned NOT NULL auto_increment,
`PostTypeId` bigint(20) unsigned NOT NULL,
`AcceptedAnswerId` bigint(20) unsigned default NULL,
`ParentId` bigint(20) unsigned default NULL,
`CreationDate` datetime NOT NULL,
`Score` int(11) NOT NULL default '0',
`ViewCount` int(11) NOT NULL default '0',
`Body` text NOT NULL,
`OwnerUserId` bigint(20) unsigned NOT NULL,
`OwnerDisplayName` varchar(40) default NULL,
`LastEditorUserId` bigint(20) unsigned default NULL,
`LastEditDate` datetime default NULL,
`LastActivityDate` datetime default NULL,
`Title` varchar(250) NOT NULL default '',
`Tags` varchar(150) NOT NULL default '',
`AnswerCount` int(11) NOT NULL default '0',
`CommentCount` int(11) NOT NULL default '0',
`FavoriteCount` int(11) NOT NULL default '0',
`ClosedDate` datetime default NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;
Note to commenters: If you want another benchmark with a different version of MySQL, a different dataset, or different table design, feel free to do it yourself. I have shown the technique above. Stack Overflow is here to show you how to do software development work, not to do all the work for you.
Mysql Group By get latest record with Count
Try joining to a subquery:
SELECT c1.id, c1.emp_id, c1.created_at, c2.cnt
FROM customer c1
INNER JOIN
(
SELECT emp_id, MAX(created_at) AS max_created_at, COUNT(*) AS cnt
FROM customer
GROUP BY emp_id
) c2
ON c1.emp_id = c2.emp_id AND c1.created_at = c2.max_created_at;
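The following sketch runs the same join against a small in-memory SQLite table through Python's sqlite3 module, to show the latest row and the per-group count coming back together. SQLite stands in for MySQL, and the sample rows and the latest_with_count helper are assumptions for illustration only.

```python
import sqlite3

def latest_with_count(rows):
    """Return each emp_id's latest row together with its total row count."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, emp_id INTEGER,"
        " created_at TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT c1.id, c1.emp_id, c1.created_at, c2.cnt
        FROM customer c1
        INNER JOIN (
            SELECT emp_id, MAX(created_at) AS max_created_at, COUNT(*) AS cnt
            FROM customer
            GROUP BY emp_id
        ) c2
          ON c1.emp_id = c2.emp_id AND c1.created_at = c2.max_created_at
        ORDER BY c1.emp_id
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data; ISO date strings compare chronologically.
customers = [
    (1, 100, "2021-01-01"),
    (2, 100, "2021-02-01"),
    (3, 200, "2021-01-15"),
    (4, 100, "2021-01-20"),
]
print(latest_with_count(customers))
# [(2, 100, '2021-02-01', 3), (3, 200, '2021-01-15', 1)]
```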
SELECT latest record group by one column
You can use variables for this:
SELECT location, parameter, datetime, value
FROM (
SELECT location, parameter, datetime, value,
@seq := IF(@loc = location, @seq + 1,
IF(@loc := location, 1, 1)) AS seq
FROM mytable
CROSS JOIN (SELECT @seq := 0, @loc := '') AS vars
ORDER BY location, datetime DESC, value DESC) AS t
WHERE t.seq = 1
The inner query has an ORDER BY clause that returns the required latest-per-group record first within its own slice. The variable @seq is set to 1 for this first record using the logic implemented by the IF functions. The outer query simply filters the derived table to get the expected record for each location slice.
How to get the latest record in each group using GROUP BY?
You should find the last timestamp value in each group (subquery), and then join this subquery to the table:
SELECT t1.* FROM messages t1
JOIN (SELECT from_id, MAX(timestamp) timestamp FROM messages GROUP BY from_id) t2
ON t1.from_id = t2.from_id AND t1.timestamp = t2.timestamp;
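One thing worth checking with this approach is what happens when two rows in a group share the maximum timestamp: the join matches both, so the group yields more than one row. The sketch below illustrates that with Python's sqlite3 module and an in-memory table. SQLite stands in for MySQL, integer timestamps are used for brevity, and the body column, the sample rows, and the latest_by_sender helper are all hypothetical.

```python
import sqlite3

def latest_by_sender(rows):
    """Join messages to the per-sender MAX(timestamp); ties return several rows."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, from_id INTEGER,"
        " timestamp INTEGER, body TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t1.id, t1.from_id, t1.timestamp, t1.body
        FROM messages t1
        JOIN (SELECT from_id, MAX(timestamp) AS timestamp
              FROM messages GROUP BY from_id) t2
          ON t1.from_id = t2.from_id AND t1.timestamp = t2.timestamp
        ORDER BY t1.id
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: sender 8 has two messages with the same timestamp.
msgs = [(1, 7, 100, "x"), (2, 7, 200, "y"), (3, 8, 150, "z"), (4, 8, 150, "w")]
print(latest_by_sender(msgs))
# [(2, 7, 200, 'y'), (3, 8, 150, 'z'), (4, 8, 150, 'w')]
```

If exactly one row per group is required, a tie-breaker such as the primary key has to be part of the comparison.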
MySQL Query For Latest Record Before a Date
Use a WHERE clause to limit your search to records before the date in question, then sort in descending order by the timestamp (sorting by ID will probably work as well) and take the first record with LIMIT 1.
SELECT * FROM your_table
WHERE ts_column < '2017-04-24 20:10:00'
ORDER BY ts_column DESC LIMIT 1
I improvised the name of your table & timestamp column, but this should give you the general idea.
how to select latest record from table by using Group by statements
You can do this without grouping at all, simply by left joining the messages table to itself, with the predicate being the same sender and a later timestamp. If there is no later timestamp, you will end up with null values in the second table, meaning you've identified the most recent message.
select s.user_name as `from`, r.user_name as `to`, m1.msg, m1.time
from messages m1
left join messages m2
on m1.time < m2.time and m1.sender = m2.sender
inner join users s
on m1.sender = s.u_id
inner join users r
on m1.receiver = r.u_id
where m2.sender is null;
If you absolutely want to use group by, you can do it by first finding the max(time) for each sender, and joining that result back to the messages and users table, like so:
select s.user_name as `from`, r.user_name as `to`, m.msg, m.time
from messages m
inner join users s
on m.sender = s.u_id
inner join users r
on m.receiver = r.u_id
inner join (
select sender, max(`time`) as ts
from messages
group by sender
) q on m.sender = q.sender and m.time = q.ts
Both queries will give you identical results.
mysql: group by and get latest record
Older MySQL versions, and newer ones with the ONLY_FULL_GROUP_BY SQL mode disabled, don't force you to put every column that is not inside an aggregate function into the GROUP BY clause. This can return strange results.
Try the following query.
SELECT cus_id,
ROUND(SUM(credit_in)-SUM(credit_out), 2) as balance,
max(date_added) latest_transaction_date
FROM `customer_wallet`
GROUP BY cus_id
HAVING ROUND(SUM(credit_in)-SUM(credit_out), 2) < 0
If you want to read more about GROUP BY in MySQL, you can check this blog post: Debunking GROUP BY myths. It's quite old, but still interesting if you are new to MySQL.
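For a quick check of the aggregated query above, this sketch loads a few rows into an in-memory SQLite table with Python's sqlite3 module and runs the same GROUP BY and HAVING logic. SQLite is only a stand-in for MySQL, and the sample rows and the negative_balances helper are invented for illustration. Every selected column is either the grouping column or an aggregate, so the result is well defined.

```python
import sqlite3

def negative_balances(rows):
    """Return (cus_id, balance, latest_transaction_date) for overdrawn customers."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE customer_wallet (cus_id INTEGER, credit_in REAL,"
        " credit_out REAL, date_added TEXT)"
    )
    conn.executemany("INSERT INTO customer_wallet VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT cus_id,
               ROUND(SUM(credit_in) - SUM(credit_out), 2) AS balance,
               MAX(date_added) AS latest_transaction_date
        FROM customer_wallet
        GROUP BY cus_id
        HAVING ROUND(SUM(credit_in) - SUM(credit_out), 2) < 0
        ORDER BY cus_id
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: customer 1 ends at -15, customer 2 at +30.
wallet = [
    (1, 10.0, 0.0, "2022-01-01"),
    (1, 0.0, 25.0, "2022-01-05"),
    (2, 50.0, 0.0, "2022-01-02"),
    (2, 0.0, 20.0, "2022-01-03"),
]
print(negative_balances(wallet))  # [(1, -15.0, '2022-01-05')]
```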