How to Select Every Row Where Column Value Is Not Distinct

This is often significantly faster than the EXISTS approach, although the difference depends on the database engine and the available indexes:

SELECT [EmailAddress], [CustomerName] FROM [Customers] WHERE [EmailAddress] IN
(SELECT [EmailAddress] FROM [Customers] GROUP BY [EmailAddress] HAVING COUNT(*) > 1)
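To try the query above without a database server, here is a minimal sketch that runs the same IN plus GROUP BY/HAVING pattern against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and the bracketed identifiers are written without brackets.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the original question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, EmailAddress TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Ann", "shared@example.com"),
        ("Bob", "shared@example.com"),
        ("Cid", "cid@example.com"),
    ],
)

# Keep every row whose EmailAddress appears more than once in the table.
duplicate_rows = conn.execute(
    """
    SELECT EmailAddress, CustomerName
    FROM Customers
    WHERE EmailAddress IN (
        SELECT EmailAddress
        FROM Customers
        GROUP BY EmailAddress
        HAVING COUNT(*) > 1
    )
    ORDER BY CustomerName
    """
).fetchall()

print(duplicate_rows)
```

Both rows sharing an address come back, while the row with the unique address is filtered out.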

How to select non unique rows

Try this:

SELECT T1.idA, T1.infos
FROM XXX T1
JOIN
(
SELECT idA
FROM XXX
GROUP BY idA
HAVING COUNT(*) >= 2
) T2
ON T1.idA = T2.idA

The result for the data you posted:


idA infos
201 1899
201 1959

How to select rows with non-unique values in a specific column when grouped by other columns?

You can use exists:

select t.*
from tbl t
where exists (select 1
from tbl t2
where t2.grp = t.grp and t2.oid = t.oid and
t2.id <> t.id
);

You can also use window functions -- although this may be less efficient:

select t.*
from (select t.*, count(*) over (partition by grp, oid) as cnt
from tbl t
) t
where cnt >= 2;
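As a quick check that the two queries above agree, the sketch below runs both against an in-memory SQLite database (window functions need SQLite 3.25 or later). The table contents are hypothetical; only the column names grp, oid and id come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, grp TEXT, oid INTEGER)")
# Hypothetical data: ids 1 and 2 share (grp, oid); ids 3 and 4 are unique pairs.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "a", 10), (2, "a", 10), (3, "a", 20), (4, "b", 10)],
)

# EXISTS variant: another row with the same (grp, oid) but a different id.
exists_ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT t.id
        FROM tbl t
        WHERE EXISTS (SELECT 1
                      FROM tbl t2
                      WHERE t2.grp = t.grp AND t2.oid = t.oid
                        AND t2.id <> t.id)
        ORDER BY t.id
        """
    )
]

# Window-function variant: count the rows in each (grp, oid) partition.
window_ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT id
        FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY grp, oid) AS cnt
              FROM tbl t) AS counted
        WHERE cnt >= 2
        ORDER BY id
        """
    )
]

print(exists_ids, window_ids)
```

One difference to keep in mind: PARTITION BY puts NULLs in the same partition, whereas the equality comparisons in the EXISTS version never match NULL, so the two can diverge when grp or oid is nullable.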

How to select non-distinct rows with a distinct on multiple columns


select COLUMN1,COLUMN2,COLUMN3 
from TABLE_NAME
group by COLUMN1,COLUMN2,COLUMN3
having COUNT(*) > 1

SELECT DISTINCT doesn't appear to work with BigQuery

Consider the approach below:

SELECT AS VALUE ANY_VALUE(t) FROM (
SELECT ID, BRAND, TITLE, SHORT_TITLE, PRICE FROM read_table
) t
GROUP BY ID

SQL: Selecting rows from non unique column values once partitioned by another column

Something like this may do the trick.

Oracle Option

I include this Oracle version because it makes it easier to see what you are querying:

select tv_id, value  
from dataTable
where (tv_id, value) in (
select tv_id, value
from dataTable
group by tv_id, value
having count(1) > 1
)

SQL

This standard SQL version should work with almost any database engine:

select d1.tv_id, d1.value
from dataTable d1
join (
select tv_id, value
from dataTable
group by tv_id, value
having count(1) > 1
) d2
on d1.tv_id=d2.tv_id
and d1.value=d2.value

You need to query the table twice because GROUP BY collapses each group into a single row. Joining the grouped result back to the table retrieves every duplicated row, as shown in your expected output.
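The difference between grouping alone and joining back can be seen in a small sketch. It uses an in-memory SQLite database with hypothetical rows; only the table and column names are taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataTable (tv_id INTEGER, value TEXT)")
# Hypothetical data: the pair (1, 'x') occurs twice, every other pair once.
conn.executemany(
    "INSERT INTO dataTable VALUES (?, ?)",
    [(1, "x"), (1, "x"), (1, "y"), (2, "x")],
)

# GROUP BY alone yields one row per duplicated (tv_id, value) pair.
grouped_only = conn.execute(
    """
    SELECT tv_id, value
    FROM dataTable
    GROUP BY tv_id, value
    HAVING COUNT(*) > 1
    """
).fetchall()

# Joining the grouped pairs back to the table yields every duplicated row.
joined_back = conn.execute(
    """
    SELECT d1.tv_id, d1.value
    FROM dataTable d1
    JOIN (SELECT tv_id, value
          FROM dataTable
          GROUP BY tv_id, value
          HAVING COUNT(*) > 1) d2
      ON d1.tv_id = d2.tv_id AND d1.value = d2.value
    """
).fetchall()

print(grouped_only)
print(joined_back)
```

The grouped query returns the duplicated pair once, and the joined query returns it once per original row.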

SQL select rows, where column value is unique (only appears once)

Use a correlated subquery:

select * from tablename a
where not exists (select 1 from tablename b where a.name=b.name having count(*)>1)

OUTPUT:

id  name
2 Chad
4 Tim
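Some engines reject HAVING without GROUP BY, so here is a hedged sketch of an alternative formulation (a swapped-in variant, not the query from the answer): filter on names whose group count is exactly one. The input rows are an assumption chosen to be consistent with the output shown; the original data is not part of this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, name TEXT)")
# Assumed input: 'Bob' appears twice, 'Chad' and 'Tim' once each.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "Bob"), (2, "Chad"), (3, "Bob"), (4, "Tim")],
)

# Keep only rows whose name occurs exactly once in the table.
unique_rows = conn.execute(
    """
    SELECT id, name
    FROM tablename
    WHERE name IN (
        SELECT name
        FROM tablename
        GROUP BY name
        HAVING COUNT(*) = 1
    )
    ORDER BY id
    """
).fetchall()

print(unique_rows)
```

With this assumed input, the rows for Chad and Tim are returned, matching the output listed above.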

Select Rows with unique column value in SQL

There are several ways to do this. Here's one (with standard SQL):

WITH xrows AS (
SELECT tbl.*
, COUNT(*) OVER (PARTITION BY time_period) AS n
FROM tbl
)
SELECT *
FROM xrows
WHERE n = 1
ORDER BY time_period
;

and with your SQL as a starting point:

WITH your_sql AS (
SELECT time_period, name, value
, COUNT(*) OVER (PARTITION BY time_period) AS n
FROM TBL.TBLValues,
UNNEST(nested_name) as unnested_name
WHERE time_period > '2021-07-01 00:00:00'
AND name != 'None'
)
SELECT *
FROM your_sql
WHERE n = 1
ORDER BY time_period
;

and now with the given available data:

WITH your_sql (row, time_period, name, value) AS (
SELECT 1, '2021-07-01T00:00:00', 'Name1', 100 UNION ALL
SELECT 2, '2021-07-01T00:00:00', 'Name2', 105 UNION ALL
SELECT 3, '2021-07-01T00:05:00', 'Name1', 120 UNION ALL
SELECT 4, '2021-07-01T00:10:00', 'Name3', 500 UNION ALL
SELECT 5, '2021-07-01T00:15:00', 'Name1', 110 UNION ALL
SELECT 6, '2021-07-01T00:15:00', 'Name3', 450
)
, xrows AS (
SELECT t.*
, COUNT(*) OVER (PARTITION BY time_period) AS n
FROM your_sql AS t
)
SELECT * FROM xrows WHERE n = 1
ORDER BY time_period
;

Result:

+-----+---------------------+-------+-------+---+
| row | time_period         | name  | value | n |
+-----+---------------------+-------+-------+---+
|   3 | 2021-07-01T00:05:00 | Name1 |   120 | 1 |
|   4 | 2021-07-01T00:10:00 | Name3 |   500 | 1 |
+-----+---------------------+-------+-------+---+

Here's the updated solution for the new requirement. I've added a duplicate of row 3 (as row 7); only one of the two rows is shown. The previous COUNT logic would have excluded both:

WITH your_sql (row, time_period, name, value) AS (
SELECT 1, '2021-07-01T00:00:00', 'Name1', 100 UNION ALL
SELECT 2, '2021-07-01T00:00:00', 'Name2', 105 UNION ALL
SELECT 3, '2021-07-01T00:05:00', 'Name1', 120 UNION ALL
SELECT 7, '2021-07-01T00:05:00', 'Name1', 120 UNION ALL
SELECT 4, '2021-07-01T00:10:00', 'Name3', 500 UNION ALL
SELECT 5, '2021-07-01T00:15:00', 'Name1', 110 UNION ALL
SELECT 6, '2021-07-01T00:15:00', 'Name3', 450
)
, xrows0 AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY time_period ORDER BY name, value, row) AS n1
, RANK() OVER (PARTITION BY time_period ORDER BY name, value ) AS n2
FROM your_sql AS t
)
, xrows AS (
SELECT t.*
, MAX(n2) OVER (PARTITION BY time_period) AS m2
FROM xrows0 AS t
)
SELECT *
FROM xrows
WHERE m2 = 1
AND n1 = 1
ORDER BY time_period
;

Result:
+-----+---------------------+-------+-------+----+----+----+
| row | time_period         | name  | value | n1 | n2 | m2 |
+-----+---------------------+-------+-------+----+----+----+
|   3 | 2021-07-01T00:05:00 | Name1 |   120 |  1 |  1 |  1 |
|   4 | 2021-07-01T00:10:00 | Name3 |   500 |  1 |  1 |  1 |
+-----+---------------------+-------+-------+----+----+----+

And for the new requirement, where only the name needs to be the same within each time_period, using your new data row:

WITH your_sql (row, time_period, name, value) AS (
SELECT 1, '2021-07-01T00:00:00', 'Name1', 100 UNION ALL
SELECT 2, '2021-07-01T00:00:00', 'Name2', 105 UNION ALL
SELECT 3, '2021-07-01T00:05:00', 'Name1', 120 UNION ALL
SELECT 8, '2021-07-01T00:05:00', 'Name1', 120 UNION ALL
SELECT 4, '2021-07-01T00:10:00', 'Name3', 500 UNION ALL
SELECT 5, '2021-07-01T00:15:00', 'Name1', 110 UNION ALL
SELECT 6, '2021-07-01T00:15:00', 'Name3', 450 UNION ALL
SELECT 7, '2021-07-01T00:20:00', 'Name1', 1000
)
, xrows0 AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY time_period ORDER BY name, row ) AS n1
, RANK() OVER (PARTITION BY time_period ORDER BY name) AS n2
FROM your_sql AS t
)
, xrows AS (
SELECT t.*
, MAX(n2) OVER (PARTITION BY time_period) AS m2
FROM xrows0 AS t
)
SELECT *
FROM xrows
WHERE m2 = 1
AND n1 = 1
ORDER BY time_period
;

Result:
+-----+---------------------+-------+-------+----+----+----+
| row | time_period         | name  | value | n1 | n2 | m2 |
+-----+---------------------+-------+-------+----+----+----+
|   3 | 2021-07-01T00:05:00 | Name1 |   120 |  1 |  1 |  1 |
|   4 | 2021-07-01T00:10:00 | Name3 |   500 |  1 |  1 |  1 |
|   7 | 2021-07-01T00:20:00 | Name1 |  1000 |  1 |  1 |  1 |
+-----+---------------------+-------+-------+----+----+----+
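As a cross-check on the last requirement (one row per time_period in which every name is the same), the sketch below uses a different, simpler formulation than the answer: GROUP BY time_period with HAVING COUNT(DISTINCT name) = 1. It runs on in-memory SQLite with the same eight data rows; the column is called row_id here, an assumed rename that avoids using the keyword row as an identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_sql (row_id INTEGER, time_period TEXT, name TEXT, value INTEGER)"
)
# Same data rows as the final example above.
conn.executemany(
    "INSERT INTO your_sql VALUES (?, ?, ?, ?)",
    [
        (1, "2021-07-01T00:00:00", "Name1", 100),
        (2, "2021-07-01T00:00:00", "Name2", 105),
        (3, "2021-07-01T00:05:00", "Name1", 120),
        (8, "2021-07-01T00:05:00", "Name1", 120),
        (4, "2021-07-01T00:10:00", "Name3", 500),
        (5, "2021-07-01T00:15:00", "Name1", 110),
        (6, "2021-07-01T00:15:00", "Name3", 450),
        (7, "2021-07-01T00:20:00", "Name1", 1000),
    ],
)

# One output row per time_period whose rows all carry the same name;
# MIN(row_id) picks the lowest row number as the representative.
single_name_periods = conn.execute(
    """
    SELECT MIN(row_id) AS row_id, time_period, MIN(name) AS name
    FROM your_sql
    GROUP BY time_period
    HAVING COUNT(DISTINCT name) = 1
    ORDER BY time_period
    """
).fetchall()

for row in single_name_periods:
    print(row)
```

On this data it selects rows 3, 4 and 7, the same rows as the window-function version. It does not carry the value column through, so the window-function approach is still the one to use when the full row is needed.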

