Sql Query to Find The Duplicate Records

Finding duplicate values in a SQL table

SELECT
name, email, COUNT(*)
FROM
users
GROUP BY
name, email
HAVING
COUNT(*) > 1

Simply group on both of the columns.

Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":

In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.

Support is not consistent:

  • Recent PostgreSQL supports it.
  • SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
  • MySQL is unpredictable and you need sql_mode=only_full_group_by:

    • GROUP BY lname ORDER BY showing wrong results;
    • Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
  • Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).

How to find duplicate records in PostgreSQL

The basic idea will be using a nested query with count aggregation:

select * from yourTable ou
where (select count(*) from yourTable inr
where inr.sid = ou.sid) > 1

You can adjust the where clause in the inner query to narrow the search.


There is another good solution for that mentioned in the comments, (but not everyone reads them):

select Column1, Column2, count(*)
from yourTable
group by Column1, Column2
HAVING count(*) > 1

Or shorter:

SELECT (yourTable.*)::text, count(*)
FROM yourTable
GROUP BY yourTable.*
HAVING count(*) > 1

SQL query to find duplicate rows and return both IDs

select
orig.original_id,
t.id as duplicate_id,
orig.email
from t
inner join (select min(id) as original_id, email
from t
group by email
having count(*)>1) orig on orig.email = t.email
having t.id!=orig.original_id

By the subquery we can find all ids for emails with duplicates.

Then we join the subquery by email and for each one use minimal id as original

UPDATE: http://rextester.com/BLIHK20984 cloned @Tim Biegeleisen's answer

Select statement to find duplicates on certain fields

To get the list of fields for which there are multiple records, you can use..

select field1,field2,field3, count(*)
from table_name
group by field1,field2,field3
having count(*) > 1

Check this link for more information on how to delete the rows.

http://support.microsoft.com/kb/139444

There should be a criterion for deciding how you define "first rows" before you use the approach in the link above. Based on that you'll need to use an order by clause and a sub query if needed. If you can post some sample data, it would really help.

SQL query to find duplicates

select * 
from File
where hash in (select
hash
from File
group by hash
having count(*) > 1)

Find duplicate entries in a column

Using:

  SELECT t.ctn_no
FROM YOUR_TABLE t
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1

...will show you the ctn_no value(s) that have duplicates in your table. Adding criteria to the WHERE will allow you to further tune what duplicates there are:

  SELECT t.ctn_no
FROM YOUR_TABLE t
WHERE t.s_ind = 'Y'
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1

If you want to see the other column values associated with the duplicate, you'll want to use a self join:

SELECT x.*
FROM YOUR_TABLE x
JOIN (SELECT t.ctn_no
FROM YOUR_TABLE t
GROUP BY t.ctn_no
HAVING COUNT(t.ctn_no) > 1) y ON y.ctn_no = x.ctn_no

How to find duplicates in a table using Access SQL?

use HAVING COUNT(name) > 1 clause

SELECT * FROM Table1
WHERE [name] IN
(SELECT name, Count(name)
FROM Table1
GROUP BY name
HAVING COUNT(name)>1)

Find duplicate records in MySQL

The key is to rewrite this query so that it can be used as a subquery.

SELECT firstname, 
lastname,
list.address
FROM list
INNER JOIN (SELECT address
FROM list
GROUP BY address
HAVING COUNT(id) > 1) dup
ON list.address = dup.address;


Related Topics



Leave a reply



Submit