How to (Or Can I) Select Distinct on Multiple Columns

How do I (or can I) SELECT DISTINCT on multiple columns?

SELECT DISTINCT a,b,c FROM t

is roughly equivalent to:

SELECT a,b,c FROM t GROUP BY a,b,c

It's a good idea to get used to the GROUP BY syntax, as it's more powerful.

For your query, I'd do it like this:

UPDATE sales
SET status='ACTIVE'
WHERE id IN
(
SELECT id
FROM sales S
INNER JOIN
(
SELECT saleprice, saledate
FROM sales
GROUP BY saleprice, saledate
HAVING COUNT(*) = 1
) T
ON S.saleprice=T.saleprice AND s.saledate=T.saledate
)

Select distinct on multiple columns simultaneously, and keep one column in PostgreSQL

This should do the trick

CREATE TABLE tblB AS (
SELECT A, B, max(C) AS max_of_C FROM tblA GROUP BY A, B
)

How to do distinct on multiple columns after join and then sort and select latest for each group?

Use DISTINCT ON:

SELECT DISTINCT ON (pg.id, p.prod_id)
pg.group_name, p.name AS prod_name, v.version
FROM product_group pg
LEFT JOIN product p ON pg.id = p.group_id
LEFT JOIN version v ON v.prod_id = p.prod_id
ORDER BY pg.id, p.prod_id, v.version DESC;

screen capture of demo link below

Demo

Select distinct of multiple columns from prestodb

You can use group by on all required columns:

SELECT tid, open_dt 
FROM table_name
GROUP BY tid, open_dt

Oracle - Select distinct on multiple columns where count = 1

select min(Id), Manufacturer, Model, min(dateCreated)
from carsSold
group by Manufacturer, Model
having count(*) = 1 and min(dateCreated) = trunc(sysdate);

This is a pretty standard group by query. The having guarantees that we only get groups with a single row. The condition against dateCreated must use an aggregate but since there's only one row in the group then min() is really the same thing.

Paraphrasing: Return all groups where the combination of manufacturer and model is counted once and the earliest date of those is the current day (or any date of your choosing.) The id and created date values are recovered as dummy aggregates.

EDIT: It's pretty clear to me that you don't intend to run this query retrospectively and that you'll only be interested in using a date of current day. So I didn't feel the need to make this comment earlier. But if you did need to look back in time then it's quite trivial to add where dateCreated <= <some date> and substitute the same date in the having clause so that all later-created rows are not considered.

Edit 2: To simply get the earliest row for each combination you can use not exists. There are actually multiple ways to express this query but here is a simple one. It's really not even related to the query above.

select * from carsSold c
where not exists (
select 1 from carsSold c2
where
c2.Manufacturer = c.Manufacturer
and c2.Model = c.Model
and c2.dateCreated < c.dateCreated
)

Counting DISTINCT over multiple columns

If you are trying to improve performance, you could try creating a persisted computed column on either a hash or concatenated value of the two columns.

Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and / or statistics can be created on it.

I believe a distinct count of the computed column would be equivalent to your query.

Get unique row with distinct multiple column in oracle table

Thanks to @marmite-bomber following query worked for me,

SELECT comp_key, loc_id, org_id, max_id, 1, sysdate, sysdate
FROM (
SELECT comp_key, max(id) max_id
FROM (
WITH t1 AS (
SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT t.id, t.loc_id, t.org_id, nvl(comp2,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT t.id, t.loc_id, t.org_id, nvl(comp3,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL UNION
SELECT t.id, t.loc_id, t.org_id, nvl(comp1,0) comp FROM tags t WHERE t.paper_id = 1 AND t.paper_id IS NOT NULL
)
SELECT t1.id, t1.loc_id loc_id, t1.org_id org_id, listagg(comp,',') within group (ORDER BY comp) AS comp_key
FROM t1
GROUP BY t1.id, t1.loc_id, t1.org_id
)
GROUP BY comp_key, loc_id, org_id
) t2, tags tg
WHERE tg.id = t2.max_id;

How do I select distinct count over multiple columns?

There are multiple options:

select count(*) from
(select distinct col1, col2, col3 FROM table) t

The other would be to combine the columns via a CONCAT:

select count(distinct col1 || col2 || col3) from table

The first option is the cleaner (and likely faster) one.



Related Topics



Leave a reply



Submit