Counting Distinct Over Multiple Columns

If you are trying to improve performance, you could try creating a persisted computed column on either a hash or a concatenation of the two columns.

Once it is persisted, provided the column is deterministic and you are using "sane" database settings, it can be indexed and/or statistics can be created on it.

I believe a distinct count of the computed column would be equivalent to your query, as long as two different column pairs never produce the same hash or concatenated string.

How to do count(distinct) for multiple columns

[TL;DR] Just use a sub-query.


If you are trying to use concatenation, you need to delimit the terms with a string that never appears in the values; otherwise distinct combinations can end up grouped together.

For example, if you have two numeric columns, COUNT(DISTINCT col1 || col2) will treat 1||23 and 12||3 as the same value (123) and count them as one group.

You could use COUNT(DISTINCT col1 || '-' || col2), but if the columns hold strings and you have 'ab-'||'-'||'c' and 'ab'||'-'||'-c', both concatenate to 'ab--c' and are, once again, counted as one.
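
The collisions described above can be reproduced with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database and a hypothetical four-row mytable; other engines may format or cast values differently, so treat it as an illustration rather than a benchmark.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in (an assumption;
# the original answers target other engines).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("1", "23"), ("12", "3"), ("ab-", "c"), ("ab", "-c")],  # 4 distinct pairs
)

# No delimiter: '1'||'23' and '12'||'3' both become '123',
# and 'ab-'||'c' and 'ab'||'-c' both become 'ab-c'.
no_delim = conn.execute(
    "SELECT COUNT(DISTINCT col1 || col2) FROM mytable"
).fetchone()[0]

# '-' as delimiter: the numeric pairs are now separated, but
# 'ab-'||'-'||'c' and 'ab'||'-'||'-c' both become 'ab--c'.
dash_delim = conn.execute(
    "SELECT COUNT(DISTINCT col1 || '-' || col2) FROM mytable"
).fetchone()[0]

# Sub-query over the column pair: no concatenation, so no collisions.
subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2 FROM mytable) t"
).fetchone()[0]

print(no_delim, dash_delim, subquery)
```

Only the sub-query form reports all four pairs; both concatenation variants under-count on this data.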

The simplest method is to use a sub-query.

If you can't do that, you can combine the columns via string concatenation, but you need to analyse the contents of the columns and pick a delimiter that does not appear in your strings; otherwise your results may be wrong. Even better, use check constraints to guarantee that the delimiter character can never appear in the values.

ALTER TABLE mytable ADD CONSTRAINT mytable__col1__chk CHECK (col1 NOT LIKE '%¬%');
ALTER TABLE mytable ADD CONSTRAINT mytable__col2__chk CHECK (col2 NOT LIKE '%¬%');

Then:

SELECT COUNT(DISTINCT col1 || '¬' || col2)
FROM mytable;
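
As a rough check of this approach, the sketch below uses Python's sqlite3 module (an assumption, not the engine the answer was written for). SQLite cannot add a CHECK constraint with ALTER TABLE, so the constraints are declared in CREATE TABLE instead; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite has no ALTER TABLE ... ADD CONSTRAINT, so the same checks
# are declared inline when the table is created.
conn.execute(
    """
    CREATE TABLE mytable (
        col1 TEXT CHECK (col1 NOT LIKE '%¬%'),
        col2 TEXT CHECK (col2 NOT LIKE '%¬%')
    )
    """
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("ab-", "c"), ("ab", "-c"), ("ab", "-c")],  # 2 distinct pairs
)

# A value containing the delimiter is rejected by the constraint.
try:
    conn.execute("INSERT INTO mytable VALUES (?, ?)", ("a¬b", "c"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# With the delimiter guaranteed absent, concatenation is unambiguous.
distinct_pairs = conn.execute(
    "SELECT COUNT(DISTINCT col1 || '¬' || col2) FROM mytable"
).fetchone()[0]

print(rejected, distinct_pairs)
```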

How do I select distinct count over multiple columns?

There are multiple options:

select count(*) from
(select distinct col1, col2, col3 FROM table) t

The other would be to concatenate the columns:

select count(distinct col1 || col2 || col3) from table

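The first option is the cleaner (and likely faster) one. The second also needs a delimiter between the columns that cannot occur in the data; otherwise different combinations can collapse into the same string.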
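
The two options can also disagree when a column is NULL. The sketch below assumes SQLite (via Python's sqlite3) and a hypothetical table t; in SQLite, as in standard SQL, concatenating with NULL yields NULL and COUNT(DISTINCT ...) skips NULLs, while SELECT DISTINCT keeps the row. Some engines handle NULL in concatenation differently, so check yours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("a", "b", "c"),
        ("a", "b", "c"),   # duplicate of the first row
        ("a", "b", None),  # NULL in the third column
        ("x", "y", "z"),
    ],
)

# Option 1: derived table of distinct rows, then count them.
via_subquery = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2, col3 FROM t) s"
).fetchone()[0]

# Option 2: concatenate and count distinct strings. The row with a
# NULL concatenates to NULL and is not counted.
via_concat = conn.execute(
    "SELECT COUNT(DISTINCT col1 || col2 || col3) FROM t"
).fetchone()[0]

print(via_subquery, via_concat)
```

On this data the sub-query counts three combinations and the concatenation only two.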

SELECT COUNT(DISTINCT... ) error on multiple columns?

COUNT() in SQL Server accepts the following syntax:

COUNT(*)
COUNT(colName)
COUNT(DISTINCT colName)

You can use a subquery that returns the unique set of make and model combinations, and then count its rows.

SELECT  COUNT(*)
FROM
(
SELECT DISTINCT make, model
FROM VehicleModelYear
) a

The "a" at the end is not a typo. It's an alias; without it you get an error such as MySQL's ERROR 1248 (42000): Every derived table must have its own alias.

How to get distinct count over multiple columns in Hive SQL?

One possible option would be

WITH sample AS (
SELECT 'A' Column1, 'B' Column2, 'C' Column3 UNION ALL
SELECT 'A', 'A', 'B' UNION ALL
SELECT 'A', 'A', NULL UNION ALL
SELECT '', 'A', NULL
)
SELECT Column1, Column2, Column3, COUNT(DISTINCT NULLIF(TRIM(c), '')) unique_count
FROM (SELECT *, ROW_NUMBER() OVER () rn FROM sample) t LATERAL VIEW EXPLODE(ARRAY(Column1, Column2, Column3)) tf AS c
GROUP BY Column1, Column2, Column3, rn;
Output:
+---------+---------+---------+--------------+
| column1 | column2 | column3 | unique_count |
+---------+---------+---------+--------------+
| | A | NULL | 1 |
| A | A | NULL | 1 |
| A | A | B | 2 |
| A | B | C | 3 |
+---------+---------+---------+--------------+
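
To make explicit what the query computes for each row, here is a plain-Python sketch of the same logic: trim each value, treat empty strings and NULL as missing, and count the distinct values that remain. The helper name unique_count and the hard-coded rows are illustrative only; this is not Hive code.

```python
def unique_count(row):
    """Count distinct non-empty values in one row.

    Mirrors COUNT(DISTINCT NULLIF(TRIM(c), '')): None (NULL) and
    strings that are empty after trimming are ignored.
    """
    cleaned = {value.strip() for value in row if value is not None}
    cleaned.discard("")
    return len(cleaned)


# Same sample rows as in the query above.
sample = [
    ("A", "B", "C"),
    ("A", "A", "B"),
    ("A", "A", None),
    ("", "A", None),
]

counts = [unique_count(row) for row in sample]
print(counts)
```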

QuickSight - count distinct of two columns

There are different ways to do it. You can simply put the center field in a field well and use the Count distinct aggregation.

Alternatively, you can create a calculated field for this count with the following definition:

distinct_count({center})

Retrieving the distinct count across multiple columns

Just group by the columns and count. This returns one row per distinct combination, together with how many times it occurs:

select Make, Model, Year, COUNT(*)
from your_table
group by Make ,Model, Year
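
A quick way to see the difference between the per-combination counts and the number of distinct combinations is the sketch below. It assumes SQLite through Python's sqlite3 module and a made-up VehicleModelYear table with four rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VehicleModelYear (Make TEXT, Model TEXT, Year INTEGER)")
conn.executemany(
    "INSERT INTO VehicleModelYear VALUES (?, ?, ?)",
    [
        ("Ford", "Focus", 2010),
        ("Ford", "Focus", 2010),  # repeated combination
        ("Ford", "Focus", 2011),
        ("Honda", "Civic", 2010),
    ],
)

# GROUP BY: one row per combination, with its number of occurrences.
per_group = conn.execute(
    """
    SELECT Make, Model, Year, COUNT(*)
    FROM VehicleModelYear
    GROUP BY Make, Model, Year
    ORDER BY Make, Model, Year
    """
).fetchall()

# Counting the groups gives the number of distinct combinations.
distinct_combos = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT Make, Model, Year FROM VehicleModelYear) a
    """
).fetchone()[0]

print(per_group, distinct_combos)
```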

How to get distinct count over multiple columns in SQL?

You want to use cross apply for this one.

select  *
from t cross apply
(select count(distinct cnt) as unique_count
from (values(Column1),(Column2),(Column3)) t(cnt)) t2
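
CROSS APPLY is not available in every engine. As a hedged alternative, not the method from the answer above, the same per-row count can be obtained by unpivoting the columns with UNION ALL and grouping by a row key. The sketch assumes SQLite via Python's sqlite3, a hypothetical id column as the row key, and made-up sample rows. Unlike a version that trims and nulls out blanks, this one counts an empty string as a value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT, Column3 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "A", "B", "C"),
        (2, "A", "A", "B"),
        (3, "A", "A", None),
        (4, "", "A", None),
    ],
)

# Unpivot the three columns into (id, c) rows, then count the distinct
# values per id. COUNT(DISTINCT ...) ignores NULL but not ''.
rows = conn.execute(
    """
    SELECT id, COUNT(DISTINCT c) AS unique_count
    FROM (
        SELECT id, Column1 AS c FROM t
        UNION ALL
        SELECT id, Column2 FROM t
        UNION ALL
        SELECT id, Column3 FROM t
    ) u
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(rows)
```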

