Finding Duplicate Rows in SQL Server

Finding duplicate values in a SQL table

-- One row per (name, email) pair that appears more than once
SELECT
    name, email, COUNT(*) AS dupeCount
FROM
    users
GROUP BY
    name, email
HAVING
    COUNT(*) > 1

Group on both columns and keep only the groups that contain more than one row.
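To see the grouping at work, here is a minimal self-contained sketch in T-SQL. The table variable @users, its column sizes, and the sample rows are assumptions made up for illustration; only the name and email columns come from the query above.

```sql
-- Hypothetical sample data; column types and values are assumptions.
DECLARE @users TABLE (id int, name nvarchar(50), email nvarchar(100));

INSERT INTO @users (id, name, email)
VALUES (1, 'Sam', 'sam@example.com'),
       (2, 'Sam', 'sam@example.com'),
       (3, 'Sam', 'sam@other.example'),
       (4, 'Tom', 'tom@example.com');

-- Only (name, email) pairs that occur more than once pass the HAVING filter.
SELECT name, email, COUNT(*) AS dupeCount
FROM @users
GROUP BY name, email
HAVING COUNT(*) > 1;
-- Returns one row: Sam, sam@example.com, 2
```

Row 3 shares a name with rows 1 and 2 but not an email, so it is not counted as a duplicate; grouping on both columns is what makes the pair the unit of comparison.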

Note: SQL-92 required every non-aggregated column in the SELECT list to appear in the GROUP BY clause. Later versions of the standard (SQL:1999 onward) relax this through the idea of "functional dependency":

In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. For GROUP BY, this means a column that is fully determined by the grouped columns (for example, any column of a table whose primary key is in the GROUP BY) does not have to be listed.

Support is not consistent:

- Recent PostgreSQL supports it.
- SQL Server (as of SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
- MySQL is unpredictable unless you set sql_mode=only_full_group_by:
  - GROUP BY lname ORDER BY showing wrong results;
  - Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
- Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).
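To make the difference concrete, the following sketch contrasts the two forms. It assumes that users has a primary key column id and that a hypothetical orders table with a user_id column exists; neither is stated above, and the orderCount alias is also made up for illustration.

```sql
-- Assumptions: users.id is the PRIMARY KEY; orders(user_id) is hypothetical.

-- Relies on functional dependency: u.name and u.email are determined by u.id,
-- so they are left out of the GROUP BY. Recent PostgreSQL accepts this form;
-- SQL Server rejects it because u.name and u.email are neither aggregated
-- nor listed in the GROUP BY.
SELECT u.id, u.name, u.email, COUNT(*) AS orderCount
FROM users u
INNER JOIN orders o ON o.user_id = u.id
GROUP BY u.id;

-- Portable form: every non-aggregated column is listed explicitly.
SELECT u.id, u.name, u.email, COUNT(*) AS orderCount
FROM users u
INNER JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name, u.email;
```

The second query should behave the same on all of the engines listed, which is why listing every non-aggregated column remains the safer habit when targeting SQL Server.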

Finding duplicate rows in SQL Server

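-- List every row whose orgName is shared with at least one other row
SELECT o.orgName, oc.dupeCount, o.id
FROM organizations o
INNER JOIN (
    -- orgName values that occur more than once, with their counts
    SELECT orgName, COUNT(*) AS dupeCount
    FROM organizations
    GROUP BY orgName
    HAVING COUNT(*) > 1
) oc ON o.orgName = oc.orgName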
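As an alternative to the self-join, a window function can attach the group size to each row directly. This is a different technique from the query above, sketched here under assumptions: the table variable @organizations and its sample rows are invented for illustration, and only the orgName and id columns come from the original query.

```sql
-- Hypothetical sample data standing in for the organizations table.
DECLARE @organizations TABLE (id int, orgName nvarchar(50));

INSERT INTO @organizations (id, orgName)
VALUES (1, 'Acme'), (2, 'Acme'), (3, 'Globex'),
       (4, 'Initech'), (5, 'Initech'), (6, 'Initech');

-- COUNT(*) OVER (PARTITION BY ...) adds the group size to every row
-- without collapsing the rows, so no join back to the table is needed.
SELECT orgName, dupeCount, id
FROM (
    SELECT orgName, id,
           COUNT(*) OVER (PARTITION BY orgName) AS dupeCount
    FROM @organizations
) AS counted
WHERE dupeCount > 1
ORDER BY orgName, id;
-- Returns five rows: the two Acme rows (dupeCount 2)
-- and the three Initech rows (dupeCount 3).
```

The derived table is needed because a window function cannot be referenced in the WHERE clause of the same query level.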

Finding Duplicate Rows in SQL Server Based on Character Matching

To treat rows as duplicates when only the first few characters match, you can group by LEFT(firstName, 3), for example:

    -- Sample data in a table variable
    declare @t table (firstName nvarchar(20), lastname nvarchar(20))

    insert into @t
    values ('Robert', 'Williams'), ('Robbie', 'Williams'), ('NotRob', 'Williams'),
           ('Steve', 'Other'), ('Steven', 'Other'), ('Someone', 'Else'), ('Roberto', 'Williams')

    -- Return every row whose combination of lastname and first three
    -- letters of firstName occurs more than once
    select t1.* from @t t1
    cross apply (
            select
                LEFT(firstName, 3) as firstNameShort, lastname
            from
                @t t2
            where LEFT(t2.firstName, 3) = LEFT(t1.firstName, 3)
                and t2.lastname = t1.lastname
            group by
                lastname, LEFT(firstName, 3)
            having
                COUNT(*) > 1) t3
    order by t1.lastname, t1.firstName
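If only the matching groups and their sizes are needed, rather than the individual rows, a plain GROUP BY on the same expression may be enough. The sketch below repeats the sample data so that it runs on its own; the dupeCount alias is an assumption borrowed from the earlier query.

```sql
-- Same sample data as above, redeclared so the batch is self-contained.
DECLARE @t TABLE (firstName nvarchar(20), lastname nvarchar(20));

INSERT INTO @t
VALUES ('Robert', 'Williams'), ('Robbie', 'Williams'), ('NotRob', 'Williams'),
       ('Steve', 'Other'), ('Steven', 'Other'), ('Someone', 'Else'), ('Roberto', 'Williams');

-- One row per (first three letters, lastname) group that has more than one member.
SELECT LEFT(firstName, 3) AS firstNameShort, lastname, COUNT(*) AS dupeCount
FROM @t
GROUP BY lastname, LEFT(firstName, 3)
HAVING COUNT(*) > 1
ORDER BY lastname, firstNameShort;
-- Returns two rows: Ste, Other, 2 and Rob, Williams, 3
```

'NotRob' falls into its own group ('Not'), so it is not reported even though the name contains 'Rob'; LEFT only compares the leading characters.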
