SQL Query to Return Only 1 Record Per Group Id

SQL query to return only 1 record per group ID

SELECT  t.*
FROM (
SELECT DISTINCT groupid
FROM mytable
) mo
CROSS APPLY
(
SELECT TOP 1 *
FROM mytable mi
WHERE mi.groupid = mo.groupid
ORDER BY
age DESC
) t

or this:

SELECT  *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY groupid ORDER BY age DESC) rn
FROM mytable
) x
WHERE x.rn = 1

This will return at most one record per group even in case of ties.
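To see the tie behaviour concretely, here is a minimal sketch that runs the ROW_NUMBER() variant against an in-memory SQLite database from Python (SQLite 3.25 or later is assumed for window functions). The sample rows and the id column are made up for illustration, and the extra id key in the ORDER BY is an addition that makes the choice between tied rows deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, groupid INTEGER, age INTEGER);
    INSERT INTO mytable (id, groupid, age) VALUES
        (1, 1, 30),
        (2, 1, 42),
        (3, 1, 42),  -- ids 2 and 3 tie on the highest age in group 1
        (4, 2, 25);
""")

# Same shape as the ROW_NUMBER() query above; "id" is added as a tie-breaker
# (an assumption, not part of the original query).
rows = conn.execute("""
    SELECT id, groupid, age
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY groupid ORDER BY age DESC, id) AS rn
        FROM mytable
    ) x
    WHERE x.rn = 1
    ORDER BY groupid
""").fetchall()

print(rows)
```

Group 1 contributes a single row even though two rows share the maximum age.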

See this article in my blog for performance comparisons of both methods:

  • SQL Server: Selecting records holding group-wise maximum

sql query to return single record per group id in sequence

You can use row_number() in the order by:

select t.*
from t
order by row_number() over (partition by agency_id order by person),
agency_id;

The second key, agency_id, puts rows that share the same row number in a consistent agency order, so each pass through the agencies lists them in the same sequence.
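A small sketch of the resulting order, run against in-memory SQLite from Python (SQLite 3.25+ assumed; the names and agencies are invented sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (agency_id INTEGER, person TEXT);
    INSERT INTO t (agency_id, person) VALUES
        (1, 'Ann'), (1, 'Bob'),
        (2, 'Cid'), (2, 'Dee'), (2, 'Eve');
""")

# First person of every agency, then second person of every agency, and so on.
rows = conn.execute("""
    SELECT agency_id, person
    FROM t
    ORDER BY ROW_NUMBER() OVER (PARTITION BY agency_id ORDER BY person),
             agency_id
""").fetchall()

for row in rows:
    print(row)
```

The rows come back interleaved: one person per agency in each round, with agency 2's third person last because agency 1 has run out.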

Get top 1 row of each group

;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1

If you expect two entries per day, this will arbitrarily pick one. To get both entries for a day, use DENSE_RANK instead.
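The difference between the two ranking functions can be checked with a short sketch like this one, run on in-memory SQLite from Python (SQLite 3.25+ assumed). The table is cut down to three columns and the dates are hypothetical; document 1 has two log rows with the same DateCreated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DocumentStatusLogs (ID INTEGER PRIMARY KEY, DocumentID INTEGER, DateCreated TEXT);
    INSERT INTO DocumentStatusLogs (ID, DocumentID, DateCreated) VALUES
        (1, 1, '2011-07-29'),
        (2, 1, '2011-07-30'),
        (3, 1, '2011-07-30'),  -- same date as ID 2
        (4, 2, '2011-08-01');
""")

def latest_ids(ranking_function):
    """Return the IDs ranked first per DocumentID by the given window function."""
    query = f"""
        WITH cte AS (
            SELECT *,
                   {ranking_function}() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
            FROM DocumentStatusLogs
        )
        SELECT ID FROM cte WHERE rn = 1 ORDER BY ID
    """
    return [row[0] for row in conn.execute(query)]

row_number_ids = latest_ids("ROW_NUMBER")  # one row per document; ID 2 or 3 is picked arbitrarily
dense_rank_ids = latest_ids("DENSE_RANK")  # both tied rows are kept

print(row_number_ids, dense_rank_ids)
```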

As for normalised or not, it depends if you want to:

  • maintain status in 2 places
  • preserve status history
  • ...

As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or drop this status history table.

get records that have only 1 record per group

Use a correlated subquery:

select * from tablename a
where not exists (select 1 from tablename b where a.empid=b.empid and a.date=b.date and b.type='Out')

OR

select empid, date,count(distinct type)
from tablename
group by empid,date
having count(distinct type)=1
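The two queries are not quite interchangeable, which a quick sketch on in-memory SQLite (via Python) makes visible. The sample rows are invented; employee 3 has only an 'Out' row, so the NOT EXISTS version drops it while the HAVING version keeps it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (empid INTEGER, date TEXT, type TEXT);
    INSERT INTO tablename (empid, date, type) VALUES
        (1, '2020-01-01', 'In'),
        (1, '2020-01-01', 'Out'),
        (1, '2020-01-02', 'In'),
        (2, '2020-01-01', 'In'),
        (3, '2020-01-01', 'Out');
""")

# Rows whose (empid, date) has no 'Out' row at all.
no_out = conn.execute("""
    SELECT a.empid, a.date, a.type
    FROM tablename a
    WHERE NOT EXISTS (SELECT 1 FROM tablename b
                      WHERE a.empid = b.empid AND a.date = b.date AND b.type = 'Out')
    ORDER BY a.empid, a.date
""").fetchall()

# (empid, date) groups that contain exactly one distinct type, whatever it is.
single_type = conn.execute("""
    SELECT empid, date, COUNT(DISTINCT type)
    FROM tablename
    GROUP BY empid, date
    HAVING COUNT(DISTINCT type) = 1
    ORDER BY empid, date
""").fetchall()

print(no_out)
print(single_type)
```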

How to select 1 row per id?

If you want one row per order_id, then window functions are not sufficient. They don't filter the data. You seem to want the most recent row. A typical method uses row_number():

select t.*
from (select t.*,
row_number() over (partition by order_id order by created_at desc) as seqnum
from t
) t
where seqnum = 1;

You can also use aggregation:

select order_id, max(ship_address), max(created_at)
from t
group by order_id;

However, the ship_address may not come from the most recent row, which is usually not desirable. In Oracle, you can fix this with the keep syntax:

select order_id,
max(ship_address) keep (dense_rank first order by created_at desc),
max(created_at)
from t
group by order_id;

However, this gets cumbersome for a lot of columns.
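To illustrate why plain aggregation can mix values from different rows, here is a rough sketch on in-memory SQLite from Python (SQLite 3.25+ assumed; the addresses and dates are made up). It compares the row_number() query with the simple max() aggregation; the keep variant is not included because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (order_id INTEGER, ship_address TEXT, created_at TEXT);
    INSERT INTO t (order_id, ship_address, created_at) VALUES
        (1, 'Zebra St',  '2020-01-01'),
        (1, 'Apple Ave', '2020-02-01'),  -- most recent row for order 1
        (2, 'Main St',   '2020-01-15');
""")

# One whole row per order_id: the most recent one.
latest = conn.execute("""
    SELECT order_id, ship_address, created_at
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY created_at DESC) AS seqnum
          FROM t
         ) s
    WHERE seqnum = 1
    ORDER BY order_id
""").fetchall()

# Each max() is computed independently, so the columns need not come from the same row.
aggregated = conn.execute("""
    SELECT order_id, MAX(ship_address), MAX(created_at)
    FROM t
    GROUP BY order_id
    ORDER BY order_id
""").fetchall()

print(latest)
print(aggregated)
```

For order 1 the aggregation pairs the alphabetically largest address with the latest date, a combination that never existed as a row.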

SQL - Only show one row per id

It sounds like you were close! Here I used a common table expression (CTE) so I'm working with one table. I then find the MAX ROWID for each ARTIKELNR, and then join them back to the CTE to get the rest of the information for the highest ROWID. This will return multiple rows for any ARTIKELNR that has duplicate max ROWIDs. But if ROWID is unique, that won't be a problem.

WITH cte AS (
SELECT dbo.PREISGRUPPEN.ARTIKELNR, KEK, PREIS, dbo.PREISGRUPPEN.ROWID
FROM dbo.PREISGRUPPEN, dbo.ARTIKEL
WHERE dbo.PREISGRUPPEN.GRUPPE = 5 AND dbo.PREISGRUPPEN.ARTIKELNR = dbo.ARTIKEL.ARTIKELNR
-- no ORDER BY here: SQL Server rejects ORDER BY inside a CTE unless TOP, OFFSET or FOR XML is also specified
)

SELECT cte.*
FROM cte
INNER JOIN (
SELECT ARTIKELNR, MAX(ROWID) AS ROWID
FROM cte
GROUP BY ARTIKELNR
) AS A
ON cte.ARTIKELNR = A.ARTIKELNR AND cte.ROWID = A.ROWID

Select first row in each GROUP BY group?

On databases that support CTE and windowing functions:

WITH summary AS (
SELECT p.id,
p.customer,
p.total,
ROW_NUMBER() OVER(PARTITION BY p.customer
ORDER BY p.total DESC) AS rn  -- "rank" is a reserved word in some databases (e.g. MySQL 8)
FROM PURCHASES p)
SELECT *
FROM summary
WHERE rn = 1

Supported by any database, but you need to add logic to break ties:

  SELECT MIN(x.id),  -- change to MAX if you want the highest
x.customer,
x.total
FROM PURCHASES x
JOIN (SELECT p.customer,
MAX(total) AS max_total
FROM PURCHASES p
GROUP BY p.customer) y ON y.customer = x.customer
AND y.max_total = x.total
GROUP BY x.customer, x.total
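A small sketch of what the tie-breaking buys you, run on in-memory SQLite from Python. The purchases are invented sample data in which Sally has two purchases with the same maximum total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PURCHASES (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER);
    INSERT INTO PURCHASES (id, customer, total) VALUES
        (1, 'Joe',   5),
        (2, 'Sally', 3),
        (3, 'Joe',   2),
        (4, 'Sally', 3);  -- ties with id 2 on Sally's maximum total
""")

max_per_customer = """
    SELECT p.customer, MAX(total) AS max_total
    FROM PURCHASES p
    GROUP BY p.customer
"""

# Joining back on the maximum alone returns every tied row.
with_ties = conn.execute(f"""
    SELECT x.id, x.customer, x.total
    FROM PURCHASES x
    JOIN ({max_per_customer}) y ON y.customer = x.customer AND y.max_total = x.total
    ORDER BY x.id
""").fetchall()

# MIN(x.id) plus the outer GROUP BY collapses the ties to one row per customer.
one_per_customer = conn.execute(f"""
    SELECT MIN(x.id), x.customer, x.total
    FROM PURCHASES x
    JOIN ({max_per_customer}) y ON y.customer = x.customer AND y.max_total = x.total
    GROUP BY x.customer, x.total
    ORDER BY x.customer
""").fetchall()

print(with_ties)
print(one_per_customer)
```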

SQL to return single row per group

You could use "group by" in your SQL request.

Let's say you have a table EMPLOYEES :

ID | NAME    | TYPE
1  | 'John'  | 'full-time'
2  | 'Mike'  | 'full-time'
3  | 'Alex'  | 'part-time'
4  | 'Jerry' | 'part-time'

You can run:

Select * from employees group by type

This would return:

ID | NAME | TYPE
1  | John | full-time
3  | Alex | part-time

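Note that with this approach the returned Name and Id are arbitrary: the database picks them, not you. Most databases reject this query outright because the selected columns are neither grouped nor aggregated; it runs on SQLite, and on MySQL only when ONLY_FULL_GROUP_BY is disabled.
To get a specific Id (say, the biggest) in the returned result, prefer the following: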

SELECT *
FROM employees e
INNER JOIN
(
SELECT MAX(id) AS id, type
FROM employees
GROUP BY type
) e2
ON e.id = e2.id
AND e.type = e2.type;
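Here is a quick sketch of the join-back query on the EMPLOYEES sample data, using in-memory SQLite from Python. The explicit column list in the outer SELECT is a small change from the query above so that id and type are not returned twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    INSERT INTO employees (id, name, type) VALUES
        (1, 'John',  'full-time'),
        (2, 'Mike',  'full-time'),
        (3, 'Alex',  'part-time'),
        (4, 'Jerry', 'part-time');
""")

# The subquery finds the biggest id per type; the join fetches the full row for it.
rows = conn.execute("""
    SELECT e.id, e.name, e.type
    FROM employees e
    INNER JOIN (
        SELECT MAX(id) AS id, type
        FROM employees
        GROUP BY type
    ) e2
    ON e.id = e2.id
    AND e.type = e2.type
    ORDER BY e.id
""").fetchall()

print(rows)
```

Unlike the bare GROUP BY, the result here is fully determined: the row with the biggest id in each type.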

How do you return a specific row per group by some criteria in BigQuery?

Consider below approach

select surname, array_agg(struct(firstname, age) order by age desc limit 1)[offset(0)].*
from your_table
group by surname


Using GROUP BY, select ID of record in each group that has lowest ID

MIN() and MAX() should return the same number of rows. Changing the function should not change the number of rows returned by the query.

Is this query part of a larger query? From looking at the sample data provided, I would assume that this code is only a snippet from a larger action you are trying to do. Do you later try to join TypeID, ContentID or FolderID with the tables the IDs are referencing?

If yes, this error is likely being caused by another part of your query and not this select statement. If you are using joins or multi-level select statements, you can get a different number of results if the reference tables do not contain a record for all the foreign IDs.

Another suggestion: check whether any of the values in your records are NULL. Although this should not affect the GROUP BY, I have sometimes encountered strange behavior when dealing with NULL values.
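Both points can be checked in isolation with a sketch along these lines, run on in-memory SQLite from Python. The table name records and its rows are hypothetical; only the column names come from the question. Rows 3 and 4 have a NULL ContentID and still land in one group, and MIN and MAX return the same number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (ID INTEGER PRIMARY KEY, TypeID INTEGER, ContentID INTEGER, FolderID INTEGER);
    INSERT INTO records (ID, TypeID, ContentID, FolderID) VALUES
        (1, 1, 10,   100),
        (2, 1, 10,   100),
        (3, 2, NULL, 200),
        (4, 2, NULL, 200);
""")

def grouped_ids(aggregate):
    """Apply MIN or MAX to ID within each (TypeID, ContentID, FolderID) group."""
    query = f"""
        SELECT {aggregate}(ID)
        FROM records
        GROUP BY TypeID, ContentID, FolderID
        ORDER BY 1
    """
    return [row[0] for row in conn.execute(query)]

min_ids = grouped_ids("MIN")
max_ids = grouped_ids("MAX")

print(min_ids, max_ids)
```

If the counts differ in your real query, that points to the joins or outer query rather than the aggregate itself.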


