How to (Or Can I) Select Distinct on Multiple Columns

How do I (or can I) SELECT DISTINCT on multiple columns?


is roughly equivalent to:

SELECT a,b,c FROM t GROUP BY a,b,c

It's a good idea to get used to the GROUP BY syntax, as it's more powerful.

For your query, I'd do it like this:

UPDATE sales
SET status='ACTIVE'
FROM sales S
SELECT saleprice, saledate
FROM sales
GROUP BY saleprice, saledate
) T
ON S.saleprice=T.saleprice AND s.saledate=T.saledate

Select distinct on multiple columns simultaneously, and keep one column in PostgreSQL

This should do the trick

SELECT A, B, max(C) AS max_of_C FROM tblA GROUP BY A, B

How to do distinct on multiple columns after join and then sort and select latest for each group?


SELECT DISTINCT ON (, p.prod_id)
pg.group_name, AS prod_name, v.version
FROM product_group pg
LEFT JOIN product p ON = p.group_id
LEFT JOIN version v ON v.prod_id = p.prod_id
ORDER BY, p.prod_id, v.version DESC;

screen capture of demo link below


Select distinct of multiple columns from prestodb

You can use group by on all required columns:

SELECT tid, open_dt 
FROM table_name
GROUP BY tid, open_dt

Oracle - Select distinct on multiple columns where count = 1

select min(Id), Manufacturer, Model, min(dateCreated)
from carsSold
group by Manufacturer, Model
having count(*) = 1 and min(dateCreated) = trunc(sysdate);

This is a pretty standard group by query. The having guarantees that we only get groups with a single row. The condition against dateCreated must use an aggregate but since there's only one row in the group then min() is really the same thing.

Paraphrasing: Return all groups where the combination of manufacturer and model is counted once and the earliest date of those is the current day (or any date of your choosing.) The id and created date values are recovered as dummy aggregates.

EDIT: It's pretty clear to me that you don't intend to run this query retrospectively and that you'll only be interested in using a date of current day. So I didn't feel the need to make this comment earlier. But if you did need to look back in time then it's quite trivial to add where dateCreated <= <some date> and substitute the same date in the having clause so that all later-created rows are not considered.

Edit 2: To simply get the earliest row for each combination you can use not exists. There are actually multiple ways to express this query but here is a simple one. It's really not even related to the query above.

select * from carsSold c
where not exists (
select 1 from carsSold c2
c2.Manufacturer = c.Manufacturer
and c2.Model = c.Model
and c2.dateCreated < c.dateCreated

Counting DISTINCT over multiple columns

If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.

Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.

I believe a distinct count of the computed column would be equivalent to your query.

Get unique row with distinct multiple column in oracle table

Thanks to @marmite-bomber following query worked for me,

SELECT comp_key, loc_id, org_id, max_id, 1, sysdate, sysdate
SELECT comp_key, max(id) max_id
WITH t1 AS (
SELECT, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT, t.loc_id, t.org_id, nvl(comp2,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT, t.loc_id, t.org_id, nvl(comp3,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL
SELECT, t1.loc_id loc_id, t1.org_id org_id, listagg(comp,',') within group (ORDER BY comp) AS comp_key
GROUP BY, t1.loc_id, t1.org_id
GROUP BY comp_key, loc_id, org_id
) t2, tags tg
WHERE = t2.max_id;

How do I select distinct count over multiple columns?

There are multiple options:

select count(*) from
(select distinct col1, col2, col3 FROM table) t

The other would be to combine the columns via a CONCAT:

select count(distinct col1 || col2 || col3) from table

The first option is the cleaner (and likely faster) one.

Related Topics

Leave a reply