Select Top N Record from Each Group SQLite

select top n record from each group sqlite

You could use a correlated subquery:

select  *
from ResultView rv1
where SubjectId || '-' || StudentId || '-' || LevelId in
(
select SubjectId || '-' || StudentId || '-' || LevelId
from ResultView rv2
where SubjectID = rv1.SubjectID
order by
total desc
limit 2
)

This query constructs a single-column primary key by concatenating three columns. If you have a real primary key (like ResultViewID) you can substitute that for SubjectId || '-' || StudentId || '-' || LevelId.

Example at SQL Fiddle.

SQlite select top N rows from the table for each item

Starting from SQLite 3.25 you could use ROW_NUMBER:

WITH cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Name
ORDER BY CAST(createdDate AS DATE) DESC) AS rn
FROM tab
)
SELECT *
FROM cte
WHERE rn <= 2;

SQLite: return only top 2 results within each group

This can be achieved by embedding your existing query inside a CTE using WITH, then using it in a WHERE ... IN subquery. The subquery selects LIMIT 2 product IDs from the CTE that match category_name, city, and gender, ordered by product count.

WITH order_groups AS (
SELECT
category_name,
city, gender,
product_id,
COUNT(product_id) AS product_count
FROM orders OO
LEFT JOIN products ON product_id = products.id
LEFT JOIN clients ON client_id = clients.id
GROUP BY product_id, category_name, city, gender
ORDER BY category_name, city, gender, COUNT(product_id) DESC
)
SELECT * FROM order_groups OG_outer
WHERE OG_outer.product_id IN (
SELECT product_id
FROM order_groups OG_inner
WHERE
OG_outer.category_name = OG_inner.category_name AND
OG_outer.city = OG_inner.city AND
OG_outer.gender = OG_inner.gender
ORDER BY OG_inner.product_count DESC LIMIT 2
)
ORDER BY category_name, city, gender, product_count DESC

This outputs the following rows as requested:

Furniture|London|Female|4|2
Furniture|London|Female|3|1
Furniture|London|Male|4|3
Furniture|London|Male|3|2
Furniture|NY|Female|5|2
Furniture|NY|Female|4|1
Furniture|NY|Male|3|3
Furniture|NY|Male|4|1
Kitchen|London|Female|9|2
Kitchen|London|Female|8|1
Kitchen|London|Male|9|3
Kitchen|London|Male|8|1
Kitchen|NY|Female|9|4
Kitchen|NY|Female|10|2
Kitchen|NY|Male|1|1
Kitchen|NY|Male|8|1
Sport|London|Female|7|2
Sport|London|Female|2|1
Sport|London|Male|7|2
Sport|London|Male|6|1
Sport|NY|Female|2|2
Sport|NY|Female|6|2
Sport|NY|Male|7|3

SQLite take N rows per each group

The following almost does what you want:

select t.*
from t
where t.rowid in (select t2.rowid
from t t2
where t2.b = t.b
order by random()
limit 3
);

Alas, the subquery will be run for every row, so this is only approximate because the random number generator changes values on each execution.

One solution is to use a temporary table to store a random number for each row, which can then be used for sorting. Unfortunately, a CTE doesn't seem to do the trick, because these are re-evaluated on each reference.

After some thought, I think a temporary table might be the only solution:

drop table if exists tempt;

create temporary table tempt as
select t.*, random() as rand
from t;

select t.*
from tempt t
where t.rowid in (select t2.rowid
from tempt t2
where t2.b = t.b
order by rand
limit 3
);

How to select the first N rows of each group?

You can do the counting using a correlated subquery:

SELECT b.BookId, a.AuthorId, a.AuthorName, b.Title
FROM Author a join
Book b
on a.AuthorId = b.AuthorId
where (select count(*)
from book b2
where b2.bookId <= b.BookId and b2.AuthorId = b.AuthorId
) <= 2;

For a small database this should be fine. If you create a composite index on Book(AuthorId, BookId) then that will help the query.

row with max value per group - SQLite

Join against that result to get the complete table records

SELECT t1.*
FROM your_table t1
JOIN
(
SELECT name, Max(population) as max_population
FROM your_table
WHERE name IN ('a', 'b', 'c')
GROUP BY name
) t2 ON t1.name = t2.name
and t1.population = t2.max_population

Get top 1 row of each group

;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1

If you expect 2 entries per day, then this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead

As for normalised or not, it depends if you want to:

  • maintain status in 2 places
  • preserve status history
  • ...

As it stands, you preserve status history. If you want latest status in the parent table too (which is denormalisation) you'd need a trigger to maintain "status" in the parent. or drop this status history table.



Related Topics



Leave a reply



Submit