delete duplicate record from same table in mysql
The following DELETE removes all duplicates, leaving you with the row that has the latest (highest) CustomerID for each name.
A word of warning, though: I don't know your use case, but it is perfectly possible for two different people to have exactly the same name (we even had a case where the address was the same as well).
DELETE c1
FROM tblm_customer c1
, tblm_customer c2
WHERE c1.FirstName = c2.FirstName
AND c1.LastName = c2.LastName
AND c1.CustomerID < c2.CustomerID
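Before running a delete like this, it can help to preview which rows the predicate matches. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere), the sample rows are made up, and the same self-join condition is turned into a SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblm_customer ("
    "CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)"
)
# Made-up sample data: "Ann Lee" appears three times.
conn.executemany(
    "INSERT INTO tblm_customer VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Ann", "Lee"), (4, "Ann", "Lee")],
)

# Same predicate as the DELETE, expressed as a SELECT: a row is a deletion
# candidate when another row has the same name and a higher CustomerID.
to_delete = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT c1.CustomerID "
        "FROM tblm_customer c1, tblm_customer c2 "
        "WHERE c1.FirstName = c2.FirstName "
        "AND c1.LastName = c2.LastName "
        "AND c1.CustomerID < c2.CustomerID "
        "ORDER BY c1.CustomerID"
    )
]
print(to_delete)  # CustomerIDs the DELETE would remove
```

If the listed rows include people who merely share a name, the predicate probably needs more columns before the DELETE is safe to run.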
Delete all Duplicate Rows except for One in MySQL?
Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.
NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.
If you want to keep the row with the lowest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
If you want to keep the row with the highest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
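To see why the id comparison matters, here is a minimal sketch (Python's sqlite3 as a stand-in for MySQL, made-up rows, and a hypothetical helper matched_ids) that lists which n1 rows the self-join matches under three different conditions. Note that <> still matches every copy of a duplicated name, so it is > or < that actually leaves one row behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "ann"), (4, "cat")],
)

def matched_ids(id_condition):
    """Return the ids of n1 rows matched by the self-join for an id condition."""
    sql = (
        "SELECT DISTINCT n1.id FROM names n1, names n2 "
        "WHERE n1.name = n2.name" + id_condition + " ORDER BY n1.id"
    )
    return [row[0] for row in conn.execute(sql)]

no_id_check = matched_ids("")                   # every row matches itself
not_equal = matched_ids(" AND n1.id <> n2.id")  # all copies of duplicated names
greater = matched_ids(" AND n1.id > n2.id")     # only the later copies

print(no_id_check, not_equal, greater)
```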
Update: Since people Googling for removing duplicates end up here: although the OP's question is about DELETE, please be advised that using INSERT and DISTINCT is much faster. For a database with 8 million rows, the query below took 13 minutes, while the DELETE ran for more than 2 hours and still didn't complete.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
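The copy step above can be sketched end to end as follows. This is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module rather than MySQL, the rows are made up, and the final drop-and-rename swap is an assumed follow-up that the answer itself does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
columns = "(cellId INT, attributeId INT, entityRowId INT, value TEXT)"
conn.execute("CREATE TABLE tableName " + columns)
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "a"), (1, 1, 1, "a"), (2, 1, 1, "b"), (2, 1, 1, "b"), (3, 2, 1, "c")],
)

# 1. Create an empty table with the same columns.
conn.execute("CREATE TABLE tempTableName " + columns)

# 2. Copy each distinct row exactly once.
conn.execute(
    "INSERT INTO tempTableName (cellId, attributeId, entityRowId, value) "
    "SELECT DISTINCT cellId, attributeId, entityRowId, value FROM tableName"
)

# 3. Swap the tables (assumed step, not part of the original answer).
conn.execute("DROP TABLE tableName")
conn.execute("ALTER TABLE tempTableName RENAME TO tableName")

rows = sorted(conn.execute("SELECT * FROM tableName").fetchall())
print(rows)
```

SELECT DISTINCT only collapses rows that are identical in every selected column, so this fits tables where the duplicates are exact copies.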
How to delete duplicates from one table, but keeping only one record?
I finally found the exact cause of the issue I faced.
I referred to the comment by @Malakiyasanjay, which you can find here: How to keep only one row of a table, removing duplicate rows?
I tried it like this (it worked for me as well, but the query took a long time to run for 30,000 rows):
delete from myTable
where id not in
(select min(id) as min from (select * from myTable) as x group by title)
The problem was that I couldn't specify 'myTable' as the target table directly, so I used (select * from myTable) as x and that resolved it.
I am sorry that I can't explain this in more detail, because I am not very familiar with MySQL queries. But you should note that:
MySQL does not allow the direct use of the target table inside a subquery like the one you use with NOT IN, but you can overcome this limitation by enclosing the subquery inside another one.
(Please refer to @forpas's answer.)
But be aware that this takes a very long time and might cause a timeout error. I ran this query on a table with about 600,000 rows and it did not respond for several days. So I conclude that this approach is only suitable for small tables.
I hope this is helpful for everyone! :)
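For reference, the effect of that statement on a tiny made-up table can be checked with the sketch below. It runs on SQLite via Python's sqlite3 module (an assumption for portability); SQLite does not have MySQL's target-table restriction, so the extra derived table is harmless there, and the alias min_id is renamed from the original for clarity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "x"), (2, "y"), (3, "x"), (4, "z"), (5, "y")],
)

# Same shape as the statement in the answer: the inner
# (select * from myTable) as x is the wrapper that MySQL needs.
conn.execute(
    "delete from myTable where id not in "
    "(select min(id) as min_id from (select * from myTable) as x group by title)"
)

remaining = conn.execute("SELECT id, title FROM myTable ORDER BY id").fetchall()
print(remaining)  # one row per title, the one with the lowest id
```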
Delete duplicate rows in mySQL in same table
MySQL provides you with the DELETE JOIN statement that allows you to remove duplicate rows quickly.
The following statement deletes duplicate rows and keeps the highest id:
DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE
t1.id < t2.id AND
t1.unique_col = t2.unique_col;
In case you want to delete duplicate rows and keep the lowest id, you can use the following statement:
DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE
t1.id > t2.id AND
t1.unique_col = t2.unique_col;
Remove duplicate rows from mysql table result
If you expect your query to return the number of duplicates, then no, it is not correct.
The condition t1.id < t2.id will join every id of t1 with all the ids from t2 that are greater, resulting in more rows or fewer rows (in the case of only 2 duplicates) than there are duplicates, and only rarely in the expected number.
See the demo.
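The row-count effect can be reproduced with a small sketch. It assumes the join is on hawb (the original query is not shown here), uses SQLite via Python's sqlite3 module as a stand-in, and the helper joined_rows is hypothetical. A group of n duplicates produces n(n-1)/2 joined rows, which equals n only when n is 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE consignment (id INTEGER PRIMARY KEY, hawb TEXT)")

def joined_rows(copies):
    """Insert `copies` rows sharing one hawb and count the self-join rows."""
    conn.execute("DELETE FROM consignment")
    conn.executemany(
        "INSERT INTO consignment (hawb) VALUES (?)", [("H1",)] * copies
    )
    return conn.execute(
        "SELECT COUNT(*) FROM consignment t1 "
        "JOIN consignment t2 ON t1.hawb = t2.hawb AND t1.id < t2.id"
    ).fetchone()[0]

# Number of joined rows for groups of 2, 3 and 4 duplicates.
counts = {n: joined_rows(n) for n in (2, 3, 4)}
print(counts)
```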
If you want to see all the duplicates:
select * from consignment t
where t.service = 'CLRC'
and exists (
select 1 from consignment
where service = t.service and id <> t.id and hawb = t.hawb
)
See the demo.
If you want to delete the duplicates and keep only the rows with the max id for each hawb, then:
delete from consignment
where service='CLRC'
and id not in (
select id from (
select max(id) id from consignment
where service='CLRC'
group by hawb
) t
);
See the demo.
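Both statements can be tried on made-up data with the sketch below (SQLite via Python's sqlite3 module as a stand-in for MySQL; the rows and the 'OTHER' service value are invented for illustration). It shows that rows of other services are left alone even when they share a hawb.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (id INTEGER PRIMARY KEY, service TEXT, hawb TEXT)"
)
conn.executemany(
    "INSERT INTO consignment VALUES (?, ?, ?)",
    [
        (1, "CLRC", "A"),
        (2, "CLRC", "A"),
        (3, "CLRC", "B"),
        (4, "OTHER", "A"),
        (5, "CLRC", "A"),
        (6, "OTHER", "A"),
    ],
)

# All CLRC rows that share a hawb with another CLRC row.
duplicates = [
    r[0]
    for r in conn.execute(
        "select id from consignment t where t.service = 'CLRC' and exists ("
        "select 1 from consignment "
        "where service = t.service and id <> t.id and hawb = t.hawb) order by id"
    )
]

# Keep only the max id per hawb within CLRC; other services are untouched.
conn.execute(
    "delete from consignment where service = 'CLRC' and id not in ("
    "select id from (select max(id) id from consignment "
    "where service = 'CLRC' group by hawb) t)"
)
remaining = [r[0] for r in conn.execute("select id from consignment order by id")]
print(duplicates, remaining)
```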
MySQL delete duplicate records but keep latest
Imagine your table test contains the following data:
select id, email
from test;
ID EMAIL
---------------------- --------------------
1 aaa
2 bbb
3 ccc
4 bbb
5 ddd
6 eee
7 aaa
8 aaa
9 eee
So, we need to find all repeated emails and delete every row for them except the one with the latest id.
In this case, aaa, bbb and eee are repeated, so we want to delete IDs 1, 7, 2 and 6.
To accomplish this, first we need to find all the repeated emails:
select email
from test
group by email
having count(*) > 1;
EMAIL
--------------------
aaa
bbb
eee
Then, from this dataset, we need to find the latest id for each one of these repeated emails:
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email;
LASTID EMAIL
---------------------- --------------------
8 aaa
4 bbb
9 eee
Finally we can now delete all of these emails with an Id smaller than LASTID. So the solution is:
delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;
I don't have MySQL installed on this machine right now, but this should work.
Update
The above delete works, but I found a more optimized version:
delete test
from test
inner join (
select max(id) as lastId, email
from test
group by email
having count(*) > 1) duplic on duplic.email = test.email
where test.id < duplic.lastId;
You can see that it deletes the oldest duplicates, i.e. 1, 7, 2, 6:
select * from test;
+----+-------+
| id | email |
+----+-------+
| 3 | ccc |
| 4 | bbb |
| 5 | ddd |
| 8 | aaa |
| 9 | eee |
+----+-------+
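The worked example can be replayed with the sketch below. One assumption to flag: SQLite (used here through Python's sqlite3 module) does not support MySQL's DELETE ... INNER JOIN syntax, so the delete is rewritten as an equivalent correlated subquery that removes every row having a later row with the same email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, email TEXT)")
emails = ["aaa", "bbb", "ccc", "bbb", "ddd", "eee", "aaa", "aaa", "eee"]
# ids 1..9 in the same order as the sample data above
conn.executemany("INSERT INTO test VALUES (?, ?)", list(enumerate(emails, start=1)))

# Latest id for each repeated email (same query as the optimized version).
last_ids = conn.execute(
    "select max(id) as lastId, email from test "
    "group by email having count(*) > 1 order by email"
).fetchall()

# Portable rewrite of the join delete: remove every row that has a later
# row with the same email.
conn.execute(
    "delete from test where id < "
    "(select max(t2.id) from test t2 where t2.email = test.email)"
)
remaining = conn.execute("select id, email from test order by id").fetchall()
print(last_ids)
print(remaining)
```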
Another version is the delete provided by Rene Limon:
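delete from test
where id not in (
-- wrapped in a derived table: MySQL rejects a direct subquery on the delete target (error 1093)
select id from (
select max(id) as id
from test
group by email) keep_rows)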
How to remove duplicate MySQL records (but only leave one)
You can use this to keep the row with the lowest id value
DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;
Or you can change > to < to keep the highest id:
DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;