Find duplicate records in MySQL
The key is to rewrite this query so that it can be used as a subquery.
SELECT firstname,
       lastname,
       list.address
FROM list
INNER JOIN (SELECT address
            FROM list
            GROUP BY address
            HAVING COUNT(id) > 1) dup
        ON list.address = dup.address;
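As a rough way to try the join above without a MySQL server, the sketch below runs the same statement against an in-memory SQLite database from Python. SQLite standing in for MySQL, the column types, and the three sample rows are all assumptions made for illustration; only the table and column names come from the query.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list ("
    "id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, address TEXT)"
)
# Hypothetical sample data: two people share one address.
conn.executemany(
    "INSERT INTO list (firstname, lastname, address) VALUES (?, ?, ?)",
    [
        ("Ann", "Lee", "1 Main St"),
        ("Bob", "Ray", "2 Oak Ave"),
        ("Cy", "Lee", "1 Main St"),
    ],
)

# The derived table "dup" holds each address that occurs more than once;
# joining back to list returns every full row with such an address.
rows = conn.execute(
    """
    SELECT list.firstname, list.lastname, list.address
    FROM list
    INNER JOIN (SELECT address
                FROM list
                GROUP BY address
                HAVING COUNT(id) > 1) dup
            ON list.address = dup.address
    ORDER BY list.id
    """
).fetchall()

print(rows)  # [('Ann', 'Lee', '1 Main St'), ('Cy', 'Lee', '1 Main St')]
```

The ORDER BY is added only to make the output order predictable in this sketch.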
Finding duplicate values in MySQL
Do a SELECT with a GROUP BY clause. Let's say name is the column you want to find duplicates in:
SELECT name, COUNT(*) c FROM `table` GROUP BY name HAVING c > 1;
This will return a result with the name value in the first column, and a count of how many times that value appears in the second.
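A minimal sketch of that result shape, again using Python's built-in sqlite3 module as a stand-in for MySQL. The table name people and the sample names are hypothetical; the HAVING clause repeats COUNT(*) rather than the alias only to keep the statement portable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data: alice appears three times, bob twice, carol once.
conn.executemany(
    "INSERT INTO people (name) VALUES (?)",
    [("alice",), ("bob",), ("alice",), ("carol",), ("bob",), ("alice",)],
)

# One row per name that occurs more than once, with its occurrence count.
duplicates = conn.execute(
    "SELECT name, COUNT(*) AS c FROM people "
    "GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

print(duplicates)  # [('alice', 3), ('bob', 2)]
```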
How to show rows with duplicate values first in SQL without grouping
Judging from the comment, your MySQL version doesn't support window functions, so you could use a correlated subquery instead:
SELECT
    t.pid,
    t.product,
    ( SELECT COUNT(*)
      FROM test_tbl ct
      WHERE ct.pid = t.pid
    ) AS counter
FROM
    test_tbl t
ORDER BY counter DESC, pid ASC;
Result:
pid  product  counter
1    Red      2
1    Black    2
3    Green    2
3    Magenta  2
2    Blue     1
4    Violet   1
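The following sketch loads the six rows from the result above into an in-memory SQLite table (an assumed stand-in for MySQL, with assumed column types) and runs the correlated-subquery version. The order of two rows that share a pid is not fixed by the ORDER BY, so the check looks only at the pid and counter columns.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_tbl (pid INTEGER, product TEXT)")
conn.executemany(
    "INSERT INTO test_tbl (pid, product) VALUES (?, ?)",
    [
        (1, "Red"), (2, "Blue"), (1, "Black"),
        (3, "Green"), (4, "Violet"), (3, "Magenta"),
    ],
)

# For every row, the subquery counts how many rows share that row's pid.
# Sorting by that count (descending) puts the duplicated pids first.
rows = conn.execute(
    """
    SELECT t.pid,
           t.product,
           (SELECT COUNT(*) FROM test_tbl ct WHERE ct.pid = t.pid) AS counter
    FROM test_tbl t
    ORDER BY counter DESC, t.pid ASC
    """
).fetchall()

# Keep only (pid, counter); product order within one pid is unspecified.
pid_and_counter = [(pid, counter) for pid, _product, counter in rows]
print(pid_and_counter)
# [(1, 2), (1, 2), (3, 2), (3, 2), (2, 1), (4, 1)]
```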
MySQL, Getting True Duplicate Records
It is easy enough to make your query work for all columns, assuming none have NULL values:
SELECT t.*
FROM `table` t
WHERE (`column_1`, `column_2`, `column_3`) IN (
SELECT `column_1`, `column_2`, `column_3`
FROM `table` t2
GROUP BY `column_1`, `column_2`, `column_3`
HAVING COUNT(id) > 1
);
If you have NULL values, then you want NULL-safe comparisons:
SELECT t.*
FROM `table` t JOIN
     (SELECT `column_1`, `column_2`, `column_3`
      FROM `table` t2
      GROUP BY `column_1`, `column_2`, `column_3`
      HAVING COUNT(id) > 1
     ) tt
     ON (tt.column_1 <=> t.column_1) AND
        (tt.column_2 <=> t.column_2) AND
        (tt.column_3 <=> t.column_3);
Of course, this is even further from the goal of simplicity.
Why you would need to see each duplicate is curious. Why not just do:
SELECT `column_1`, `column_2`, `column_3`, COUNT(*)
FROM `table` t2
GROUP BY `column_1`, `column_2`, `column_3`
HAVING COUNT(id) > 1
In both these cases, though, you need to list out all the columns (at least once). I don't think there is a way to do this in MySQL otherwise. Some databases allow you to create a JSON object or XML object for an entire row -- making this possible without listing all the columns. I cannot think of anything similar in MySQL.
MySQL: use DELETE FROM to remove duplicate rows
First, this is a very bad way of implementing this code. But I guess you get what you pay for.
Second, simply run the query as a SELECT:
SELECT p1.*, p2.*
FROM Person p1 JOIN
Person p2
ON p1.Email = p2.Email AND p1.Id > p2.Id;
(Note that I've rewritten the logic as a JOIN. You should always use proper, explicit, standard, readable JOIN syntax, but the two methods are functionally equivalent.)
On your second example, the results of this query are:
table1 email      table1 id  table2 id
john@example.com  2          1
john@example.com  3          1
john@example.com  3          2
What is notable is that id = 1 is never in the second column -- and that is the column that determines which ids are deleted. In other words, every row except the one with the smallest id for each email gets deleted, because for each of those rows a smaller id exists.
This also hints at why this is a really bad solution. MySQL has to deal with two rows for id = 3. Perhaps it attempts to delete both. Perhaps it has to just deal with extra data. Either way, there is extra work. And the more rows with the same email in the data the more extra duplicates are created.
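To put a number on that extra work, here is a small sketch that counts how many rows the self-join produces when n rows share one email. It uses an in-memory SQLite database as an assumed stand-in for MySQL, and matched_pairs is a hypothetical helper name. With n matching rows the join returns n(n-1)/2 pairs, even though only n-1 rows need to be deleted.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (Id INTEGER PRIMARY KEY, Email TEXT)")


def matched_pairs(n):
    """Return how many rows the self-join yields when n rows share one email."""
    conn.execute("DELETE FROM Person")
    conn.executemany(
        "INSERT INTO Person (Email) VALUES (?)",
        [("john@example.com",)] * n,
    )
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM Person p1 JOIN Person p2 "
        "ON p1.Email = p2.Email AND p1.Id > p2.Id"
    ).fetchone()
    return count


# Pairs grow quadratically while the rows to delete grow linearly.
pair_counts = {n: matched_pairs(n) for n in (3, 10, 100)}
print(pair_counts)  # {3: 3, 10: 45, 100: 4950}
```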
An alternative method, such as:
delete p
from person p join
(select email, min(id) as min_id
from person p2
group by email
) p2
on p.email = p2.email and p.id > p2.min_id;
Does not have this problem and, in my opinion, the intent is clearer.
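The "keep the smallest id per email" rule can be tried out with the sketch below. It is not the MySQL statement above: SQLite (used here as an assumed stand-in) has no multi-table DELETE ... JOIN, so the same rule is written with NOT IN over the per-email minimum ids. MySQL rejects a subquery that reads directly from the table being deleted from, so on MySQL the joined form shown above remains the one to use. The bob@example.com row is hypothetical data added to show that non-duplicated rows survive.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO person (id, email) VALUES (?, ?)",
    [
        (1, "john@example.com"),
        (2, "john@example.com"),
        (3, "john@example.com"),
        (4, "bob@example.com"),  # hypothetical non-duplicate row
    ],
)

# Keep only the row with the smallest id for each email.
# SQLite-specific phrasing of the same rule as the MySQL DELETE ... JOIN.
conn.execute(
    "DELETE FROM person "
    "WHERE id NOT IN (SELECT MIN(id) FROM person GROUP BY email)"
)

remaining = conn.execute("SELECT id, email FROM person ORDER BY id").fetchall()
print(remaining)  # [(1, 'john@example.com'), (4, 'bob@example.com')]
```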
MySQL Query return duplicate rows (No duplicates in table)
You forgot a condition on the join with the 'versions' table, so your query might actually be returning one row per row in 'versions'.