Finding duplicate values in a SQL table
SELECT
name, email, COUNT(*)
FROM
users
GROUP BY
name, email
HAVING
COUNT(*) > 1
Simply group on both of the columns.
Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":
In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
Support is not consistent:
- Recent PostgreSQL supports it.
- SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
- MySQL is unpredictable and you need
sql_mode=only_full_group_by
:- GROUP BY lname ORDER BY showing wrong results;
- Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
- Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).
How do I find duplicate entries in a database table?
A nested query can do the job.
SELECT author_last_name, dewey_number, NumOccurrences
FROM author INNER JOIN
( SELECT author_id, dewey_number, COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY author_id, dewey_number
HAVING ( COUNT(dewey_number) > 1 ) ) AS duplicates
ON author.id = duplicates.author_id
(I don't know if this is the fastest way to achieve what you want.)
Update: Here is my data
SELECT * FROM author;
id | author_last_name
----+------------------
1 | Fowler
2 | Knuth
3 | Lang
SELECT * FROM book;
id | author_id | dewey_number | title
----+-----------+--------------+------------------------
1 | 1 | 600 | Refactoring
2 | 1 | 600 | Refactoring
3 | 1 | 600 | Analysis Patterns
4 | 2 | 600 | TAOCP vol. 1
5 | 2 | 600 | TAOCP vol. 1
6 | 2 | 600 | TAOCP vol. 2
7 | 3 | 500 | Algebra
8 | 3 | 500 | Undergraduate Analysis
9 | 1 | 600 | Refactoring
10 | 2 | 500 | Concrete Mathematics
11 | 2 | 500 | Concrete Mathematics
12 | 2 | 500 | Concrete Mathematics
And here is the result of the above query:
author_last_name | dewey_number | numoccurrences
------------------+--------------+----------------
Fowler | 600 | 4
Knuth | 600 | 3
Knuth | 500 | 3
Lang | 500 | 2
How to find duplicate records in PostgreSQL
The basic idea will be using a nested query with count aggregation:
select * from yourTable ou
where (select count(*) from yourTable inr
where inr.sid = ou.sid) > 1
You can adjust the where clause in the inner query to narrow the search.
There is another good solution for that mentioned in the comments, (but not everyone reads them):
select Column1, Column2, count(*)
from yourTable
group by Column1, Column2
HAVING count(*) > 1
Or shorter:
SELECT (yourTable.*)::text, count(*)
FROM yourTable
GROUP BY yourTable.*
HAVING count(*) > 1
Finding duplicate values in MySQL
Do a SELECT
with a GROUP BY
clause. Let's say name is the column you want to find duplicates in:
SELECT name, COUNT(*) c FROM table GROUP BY name HAVING c > 1;
This will return a result with the name value in the first column, and a count of how many times that value appears in the second.
Find duplicate records in MySQL
The key is to rewrite this query so that it can be used as a subquery.
SELECT firstname,
lastname,
list.address
FROM list
INNER JOIN (SELECT address
FROM list
GROUP BY address
HAVING COUNT(id) > 1) dup
ON list.address = dup.address;
How do I find duplicate values in a table in Oracle?
Aggregate the column by COUNT, then use a HAVING clause to find values that appear more than once.
SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
Related Topics
SQL Missing Right Parenthesis on Order by Statement
How to Release Possible Postgres Row Locks
Sql- Ignore Case While Searching for a String
How to Delete Duplicates from a Database Table Based on a Certain Field
Fetch a Record on Maximum Date of Every Month
In Ms SQL Server, How to "Atomically" Increment a Column Being Used as a Counter
A Procedure to Reverse a String in Pl/Sql
Find N Nearest Neighbors for Given Point Using Postgis
Select Something That Has More/Less Than X Character
Select Top Distinct Results Ordered by Frequency
Should I Set Max Pool Size in Database Connection String? What Happens If I Don'T
How to Add Multiple "Not Like '%%' in the Where Clause of SQLite3
How to Use Merge on Linked Servers
Ansi_Nulls and Quoted_Identifier Killed Things. What Are They For
Multiple Counts Within a Single SQL Query
Odata Case In-Sensitive Filtering in Web API
How to Parse/Tokenize an SQL Statement in Node.Js
SQL Server 2008 Open Master Key Error Upon Physical Server Change Over