Query last N related rows per row
Index
First, a multicolumn index will help:
CREATE INDEX observations_special_idx
ON observations (station_id, created_at DESC, id);
created_at DESC is a slightly better fit, but the index would still be scanned backwards at almost the same speed without DESC.
Assuming created_at is defined NOT NULL, else consider DESC NULLS LAST in index and query:
- Sort by column ASC, but NULL values first?
The last column id is only useful if you get an index-only scan out of it, which probably won't work if you add lots of new rows constantly. In this case, remove id from the index.
Simpler query (still slow)
Simplify your query. The inner subselect doesn't help:
SELECT id
FROM (
SELECT station_id, id, created_at
, row_number() OVER (PARTITION BY station_id
ORDER BY created_at DESC) AS rn
FROM observations
) s
WHERE rn <= #{n} -- your limit here
ORDER BY station_id, created_at DESC;
Should be a bit faster, but still slow.
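To try the window-function query without a Postgres server, here is a minimal sketch that runs the same SQL against an in-memory SQLite database (3.25 or newer, for window functions). The sample rows and the helper name last_n_per_station are made up for illustration, and the limit is passed as a bound parameter rather than interpolated like #{n}.

```python
import sqlite3

# Hypothetical miniature of the observations table; SQLite stands in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations ("
    " id INTEGER PRIMARY KEY,"
    " station_id INTEGER NOT NULL,"
    " created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [
        (1, 1, "2024-01-01"), (2, 1, "2024-01-02"), (3, 1, "2024-01-03"),
        (4, 2, "2024-01-01"), (5, 2, "2024-01-05"),
        (6, 3, "2024-01-04"),
    ],
)

def last_n_per_station(conn, n):
    """Return the ids of the n newest observations per station, newest first."""
    sql = """
        SELECT id
        FROM  (
           SELECT station_id, id, created_at
                , row_number() OVER (PARTITION BY station_id
                                     ORDER BY created_at DESC) AS rn
           FROM   observations
           ) s
        WHERE  rn <= ?          -- bound parameter instead of #{n}
        ORDER  BY station_id, created_at DESC
    """
    return [row[0] for row in conn.execute(sql, (n,))]

print(last_n_per_station(conn, 2))  # [3, 2, 5, 4, 6]
```

Every row of the table is still numbered before the filter applies, which is why this form stays slow on large tables.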
Fast query
- Assuming you have relatively few stations and relatively many observations per station.
- Also assuming station_id is defined as NOT NULL.
To be really fast, you need the equivalent of a loose index scan (not implemented in Postgres, yet). Related answer:
- Optimize GROUP BY query to retrieve latest row per user
If you have a separate table of stations (which seems likely), you can emulate this with JOIN LATERAL (Postgres 9.3+):
SELECT o.id
FROM stations s
CROSS JOIN LATERAL (
SELECT o.id, o.created_at -- created_at is needed by the outer ORDER BY
FROM observations o
WHERE o.station_id = s.station_id -- lateral reference
ORDER BY o.created_at DESC
LIMIT #{n} -- your limit here
) o
ORDER BY s.station_id, o.created_at DESC;
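Conceptually, the LATERAL join runs one small "newest n rows" lookup per station, and each lookup can be served from the (station_id, created_at DESC) index. SQLite has no LATERAL, so the sketch below only emulates that access pattern from the application side, with one query per station. It is not the Postgres query itself, and the sample data and the helper name last_n_lateral_style are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stations (station_id INTEGER PRIMARY KEY);
    CREATE TABLE observations (
        id         INTEGER PRIMARY KEY,
        station_id INTEGER NOT NULL REFERENCES stations (station_id),
        created_at TEXT NOT NULL
    );
    CREATE INDEX observations_special_idx
        ON observations (station_id, created_at DESC);
    INSERT INTO stations VALUES (1), (2), (3);
    INSERT INTO observations VALUES
        (1, 1, '2024-01-01'), (2, 1, '2024-01-02'), (3, 1, '2024-01-03'),
        (4, 2, '2024-01-01'), (5, 2, '2024-01-05'),
        (6, 3, '2024-01-04');
""")

def last_n_lateral_style(conn, n):
    """One LIMIT-ed lookup per station, mimicking what the lateral subquery does."""
    ids = []
    stations = conn.execute(
        "SELECT station_id FROM stations ORDER BY station_id"
    ).fetchall()
    for (station_id,) in stations:
        newest = conn.execute(
            "SELECT id FROM observations"
            " WHERE station_id = ?"       # plays the role of the lateral reference
            " ORDER BY created_at DESC"
            " LIMIT ?",
            (station_id, n),
        )
        ids.extend(row[0] for row in newest)
    return ids

print(last_n_lateral_style(conn, 2))  # [3, 2, 5, 4, 6]
```

In Postgres the single LATERAL statement is preferable, since it avoids a round trip per station.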
If you don't have a table of stations, the next best thing would be to create and maintain one. Possibly add a foreign key reference to enforce relational integrity.
If that's not an option, you can distill such a table on the fly. Simple options would be:
SELECT DISTINCT station_id FROM observations;
SELECT station_id FROM observations GROUP BY 1;
But either would need a sequential scan and be slow. Make Postgres use the above index (or any btree index with station_id as leading column) with a recursive CTE:
WITH RECURSIVE stations AS (
( -- extra pair of parentheses ...
SELECT station_id
FROM observations
ORDER BY station_id
LIMIT 1
) -- ... is required!
UNION ALL
SELECT (SELECT o.station_id
FROM observations o
WHERE o.station_id > s.station_id
ORDER BY o.station_id
LIMIT 1)
FROM stations s
WHERE s.station_id IS NOT NULL -- serves as break condition
)
SELECT station_id
FROM stations
WHERE station_id IS NOT NULL; -- remove dangling row with NULL
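The same "jump to the next larger station_id" idea can be tried in an in-memory SQLite database. This is a sketch under assumptions: SQLite does not accept the parenthesized first SELECT, so the anchor row is written as a scalar subquery instead, and the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE observations (
        id         INTEGER PRIMARY KEY,
        station_id INTEGER NOT NULL,
        created_at TEXT NOT NULL
    );
    CREATE INDEX observations_station_idx ON observations (station_id);
    INSERT INTO observations VALUES
        (1, 7, '2024-01-01'), (2, 7, '2024-01-02'), (3, 3, '2024-01-03'),
        (4, 3, '2024-01-01'), (5, 9, '2024-01-05'), (6, 7, '2024-01-04');
""")

DISTINCT_STATIONS = """
    WITH RECURSIVE stations(station_id) AS (
       -- anchor: smallest station_id (NULL if the table is empty)
       SELECT (SELECT station_id FROM observations
               ORDER BY station_id LIMIT 1)
       UNION ALL
       -- step: next larger station_id, or NULL when none is left
       SELECT (SELECT o.station_id FROM observations o
               WHERE  o.station_id > s.station_id
               ORDER  BY o.station_id LIMIT 1)
       FROM   stations s
       WHERE  s.station_id IS NOT NULL      -- break condition
       )
    SELECT station_id
    FROM   stations
    WHERE  station_id IS NOT NULL           -- drop the dangling NULL row
    ORDER  BY station_id
"""

station_ids = [row[0] for row in conn.execute(DISTINCT_STATIONS)]
print(station_ids)  # [3, 7, 9]
```

Each recursion step is a single index probe, so the cost grows with the number of distinct stations rather than with the number of observations.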
Use that as a drop-in replacement for the stations table in the LATERAL query above:
WITH RECURSIVE stations AS (
(
SELECT station_id
FROM observations
ORDER BY station_id
LIMIT 1
)
UNION ALL
SELECT (SELECT o.station_id
FROM observations o
WHERE o.station_id > s.station_id
ORDER BY o.station_id
LIMIT 1)
FROM stations s
WHERE s.station_id IS NOT NULL
)
SELECT o.id
FROM stations s
CROSS JOIN LATERAL (
SELECT o.id, o.created_at
FROM observations o
WHERE o.station_id = s.station_id
ORDER BY o.created_at DESC
LIMIT #{n} -- your limit here
) o
WHERE s.station_id IS NOT NULL
ORDER BY s.station_id, o.created_at DESC;
This should still be faster than what you had by orders of magnitude.
SQL Server SELECT LAST N Rows
You can also do it with ROW_NUMBER() OVER (PARTITION BY ...). A good example, quoted:
I am using the Orders table of the Northwind database... Now let us retrieve the Last 5 orders placed by Employee 5:
SELECT OrderID, CustomerID, OrderDate
FROM
(
-- rn numbers each employee's orders from newest (1) to oldest
SELECT ROW_NUMBER() OVER (PARTITION BY EmployeeID ORDER BY OrderDate DESC) AS rn, *
FROM Orders
) AS ordlist
WHERE ordlist.EmployeeID = 5
AND ordlist.rn <= 5;
Select last N rows from MySQL
You can do it with a sub-query:
SELECT * FROM
(
SELECT * FROM `table` ORDER BY id DESC LIMIT 50 -- table is a reserved word: quote it or substitute your table name
) AS sub
ORDER BY id ASC;
This will select the last 50 rows from table, and then order them in ascending order.
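A quick way to check the "take the tail, then flip it" subquery is to run it against a throwaway table. The sketch below uses an in-memory SQLite database as a stand-in for MySQL. The events table and the helper last_n_ascending are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, f"event {i}") for i in range(1, 11)],  # ids 1..10
)

def last_n_ascending(conn, n):
    """Pick the n highest ids in the subquery, then re-sort them ascending."""
    sql = """
        SELECT id FROM
        (
            SELECT * FROM events ORDER BY id DESC LIMIT ?
        ) AS sub
        ORDER BY id ASC
    """
    return [row[0] for row in conn.execute(sql, (n,))]

print(last_n_ascending(conn, 3))  # [8, 9, 10]
```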
MySQL Query get the last N rows per Group
In MySQL before 8.0 (which has no window functions), this is most easily done using variables:
select t.*
from (select t.*,
(@rn := if(@v = VehicleId, @rn + 1,
if(@v := VehicleId, 1, 1)
)
) as rn
from `table` t cross join -- substitute your table name
(select @v := -1, @rn := 0) params
order by VehicleId, timestamp desc
) t
where rn <= 3;
How do I query SQL for the latest record date for each user
select t.username, t.date, t.value
from MyTable t
inner join (
select username, max(date) as MaxDate
from MyTable
group by username
) tm on t.username = tm.username and t.date = tm.MaxDate
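One property of the max(date) join worth seeing on data: if a user has several rows sharing the latest date, all of them come back. The sketch below runs the query in an in-memory SQLite database with invented rows. The trailing ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (username TEXT, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        ("alice", "2024-01-01", 10),
        ("alice", "2024-02-01", 20),
        ("bob",   "2024-01-15", 5),
        ("carol", "2024-03-01", 7),
        ("carol", "2024-03-01", 8),   # tie on carol's latest date
    ],
)

LATEST_PER_USER = """
    select t.username, t.date, t.value
    from MyTable t
    inner join (
        select username, max(date) as MaxDate
        from MyTable
        group by username
    ) tm on t.username = tm.username and t.date = tm.MaxDate
    order by t.username, t.value
"""

latest = conn.execute(LATEST_PER_USER).fetchall()
for row in latest:
    print(row)
# ('alice', '2024-02-01', 20)
# ('bob', '2024-01-15', 5)
# ('carol', '2024-03-01', 7)
# ('carol', '2024-03-01', 8)
```

If exactly one row per user is required, a tie-breaker column (for example a unique id) has to be part of the comparison.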
Get first/last n records per group by
I think this is what you need:
SELECT tableA.idA, tableA.titleA, temp.idB, temp.textB
FROM tableA
INNER JOIN
(
SELECT tB1.idB, tB2.idA,
(
SELECT textB
FROM tableB
WHERE tableB.idB = tB1.idB
) as textB
FROM tableB as tB1
JOIN tableB as tB2
ON tB1.idA = tB2.idA AND tB1.idB >= tB2.idB
GROUP BY tB1.idA, tB1.idB
HAVING COUNT(*) <= 5
ORDER BY idA, idB
) as temp
ON tableA.idA = temp.idA
More info about this method here:
http://www.sql-ex.ru/help/select16.php
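The core of this method is the self-join: each row is paired with every row of its group that has an equal or smaller idB, so COUNT(*) is the row's rank within the group. A reduced sketch of just that ranking step, run in an in-memory SQLite database with made-up rows (the helper name first_n_per_group is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableB (idB INTEGER PRIMARY KEY, idA INTEGER, textB TEXT)"
)
conn.executemany(
    "INSERT INTO tableB VALUES (?, ?, ?)",
    [
        (1, 1, "a"), (2, 1, "b"), (3, 1, "c"),
        (4, 2, "d"), (5, 2, "e"), (6, 2, "f"),
    ],
)

def first_n_per_group(conn, n):
    """Return (idA, idB) for the n rows with the smallest idB in each idA group."""
    sql = """
        SELECT tB1.idA, tB1.idB
        FROM   tableB AS tB1
        JOIN   tableB AS tB2
          ON   tB1.idA = tB2.idA AND tB1.idB >= tB2.idB
        GROUP  BY tB1.idA, tB1.idB
        HAVING COUNT(*) <= ?      -- COUNT(*) is the rank within the group
        ORDER  BY tB1.idA, tB1.idB
    """
    return conn.execute(sql, (n,)).fetchall()

print(first_n_per_group(conn, 2))  # [(1, 1), (1, 2), (2, 4), (2, 5)]
```

Flipping the comparison to tB1.idB <= tB2.idB ranks from the other end and yields the last n rows per group. The join is quadratic in group size, so it suits small groups best.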
How can I fetch the last N rows, WITHOUT ordering the table
The general way to get the "last" row for each device_id looks like this.
select *
from Table1
inner join (select device_id, max(time) max_time
from Table1
group by device_id) T2
on Table1.device_id = T2.device_id
and Table1.time = T2.max_time;
Getting the "last" 200 device_id numbers without using an ORDER BY isn't really practical, but it's not clear why you might want to do that in the first place. If 200 is an arbitrary number, then you can get better performance by taking a subset of the table that's based on an arbitrary time instead.
select *
from Table1
inner join (select device_id, max(time) max_time
from Table1
where time > '2013-03-23 12:03'
group by device_id) T2
on Table1.device_id = T2.device_id
and Table1.time = T2.max_time;
Get at least last 2 rows from each row in a joined mysql 5.X tables
Since the desired output in your question only has columns from your Tracings table, you don't need a join. Querying only the Tracings table is more efficient.
The following approach (MySQL v5.5) uses variables to number the rows within each group and a WHERE clause to limit by that row number.
SET @row_num:=0;
SET @prev_grp:=NULL;
SELECT
t.idTrace,
t.idProcess
FROM (
SELECT
*,
@row_num:=(
CASE
WHEN @prev_grp<>idProcess THEN 1
ELSE @row_num+1
END
) as rn,
@prev_grp:=idProcess
FROM
Tracings
ORDER BY
idProcess,idTrace DESC
) t
WHERE rn <=2
ORDER BY t.idProcess,t.idTrace DESC;
Or as a single query:
SELECT
t.idTrace,
t.idProcess
FROM (
SELECT
*,
@row_num:=(
CASE
WHEN @prev_grp<>idProcess THEN 1
ELSE @row_num+1
END
) as rn,
@prev_grp:=idProcess
FROM
Tracings
CROSS JOIN (SELECT @row_num:=0,@prev_grp:=NULL) as vars
ORDER BY
idProcess,idTrace DESC
) t
WHERE rn <=2
ORDER BY t.idProcess,t.idTrace DESC;
idTrace | idProcess |
---|---|
3 | 1 |
2 | 1 |
7 | 2 |
6 | 2 |