Delete Duplicate Record from Same Table in MySQL

The following DELETE removes all duplicates, leaving only the row with the highest (latest) CustomerID for each name.

A note of warning, though: I don't know your use case, but it is perfectly possible for two people to have exactly the same name (at one point we even had two with the same address).

DELETE c1
FROM tblm_customer c1
, tblm_customer c2
WHERE c1.FirstName = c2.FirstName
AND c1.LastName = c2.LastName
AND c1.CustomerID < c2.CustomerID;
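
Before running a DELETE like this, it can help to preview which rows the self-join selects by running the same conditions as a SELECT. The sketch below does that with Python's built-in sqlite3 module, used here only as a stand-in so the example runs without a MySQL server; the sample rows and the helper name rows_to_delete are made up for illustration.

```python
import sqlite3


def rows_to_delete(conn):
    """Return the CustomerIDs that the self-join conditions select for removal."""
    cur = conn.execute(
        """
        SELECT DISTINCT c1.CustomerID
        FROM tblm_customer c1, tblm_customer c2
        WHERE c1.FirstName = c2.FirstName
          AND c1.LastName = c2.LastName
          AND c1.CustomerID < c2.CustomerID
        ORDER BY c1.CustomerID
        """
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblm_customer "
    "(CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT)"
)
# Hypothetical sample data: "Ann Lee" appears three times.
conn.executemany(
    "INSERT INTO tblm_customer VALUES (?, ?, ?)",
    [(1, "Ann", "Lee"), (2, "Bob", "Ray"), (3, "Ann", "Lee"), (4, "Ann", "Lee")],
)
print(rows_to_delete(conn))
```

With this data the preview lists CustomerIDs 1 and 3, so only the latest "Ann Lee" row (4) and the unique row (2) would survive.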

Delete all Duplicate Rows except for One in MySQL?

Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless the join also excluded a row matching itself (AND n1.id <> n2.id, or the strict > and < comparisons used below), it deleted every row in the table.

  1. If you want to keep the row with the lowest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name;

  2. If you want to keep the row with the highest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name;

I used this method in MySQL 5.1; I have not checked it on other versions.
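
The warning about losing every row can be checked on a tiny table: joined on name alone, each row matches itself, so every id is selected. The sketch below runs the join as a SELECT in Python's sqlite3 (a stand-in for MySQL, chosen so the example is runnable as-is); the sample rows and the helper matched_ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data: "ann" is duplicated, "bob" is unique.
conn.executemany(
    "INSERT INTO names VALUES (?, ?)", [(1, "ann"), (2, "ann"), (3, "bob")]
)


def matched_ids(extra_condition):
    """Ids of the n1 rows the self-join selects, i.e. what DELETE n1 would remove."""
    sql = (
        "SELECT DISTINCT n1.id FROM names n1, names n2 "
        "WHERE n1.name = n2.name" + extra_condition + " ORDER BY n1.id"
    )
    return [row[0] for row in conn.execute(sql)]


name_only = matched_ids("")                      # each row also matches itself
keep_lowest = matched_ids(" AND n1.id > n2.id")  # only the later duplicate
print(name_only, keep_lowest)
```

Here name_only comes back as all three ids, while keep_lowest contains only id 2.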


Update: Since people searching for how to remove duplicates end up here:
Although the OP's question is about DELETE, be advised that INSERT with SELECT DISTINCT can be much faster. On a table with 8 million rows, the query below took 13 minutes, whereas the DELETE had still not finished after more than 2 hours.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
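
As a rough illustration of the copy-the-distinct-rows approach, the sketch below runs the same INSERT ... SELECT DISTINCT against two small tables. It uses Python's sqlite3 as a stand-in for MySQL; the column types and sample rows are assumptions, and tempTableName is created up front because the INSERT needs an existing target table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; the original post does not show the table definition.
cols = "cellId INTEGER, attributeId INTEGER, entityRowId INTEGER, value TEXT"
conn.execute(f"CREATE TABLE tableName ({cols})")
conn.execute(f"CREATE TABLE tempTableName ({cols})")
# Hypothetical sample data with one exact duplicate row.
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [(1, 10, 100, "a"), (1, 10, 100, "a"), (2, 20, 200, "b")],
)

# Copy each distinct combination of the four columns exactly once.
conn.execute(
    """
    INSERT INTO tempTableName (cellId, attributeId, entityRowId, value)
    SELECT DISTINCT cellId, attributeId, entityRowId, value
    FROM tableName
    """
)
deduped = conn.execute("SELECT * FROM tempTableName ORDER BY cellId").fetchall()
print(deduped)
```

Note that DISTINCT only collapses rows that are identical in every selected column, so this suits exact duplicates rather than rows that differ in an id.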

How to delete duplicates from one table, but keeping only one record?

I finally found the exact cause of the issue I was facing.
I followed @Malakiyasanjay's comment, which you can find under "How to keep only one row of a table, removing duplicate rows?"

I tried the following. It worked for me as well, but it took a long time to run on 30,000 rows:

delete from myTable
where id not in
(select min(id) as min from (select * from myTable) as x group by title)

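The problem was that I couldn't reference 'myTable', the target table of the DELETE, directly in the subquery (MySQL reports error 1093, "You can't specify target table ... for update in FROM clause"). So I used (select * from myTable) as x instead, and that solved it.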

I'm sorry I can't explain this in more detail, because I'm not very familiar with MySQL queries. But you should note that:

MySQL does not allow the direct use of the target table inside a subquery like the one used with NOT IN, but you can overcome this limitation by enclosing the subquery inside another one.
(See @forpas's answer.)

Be aware, though, that this takes a very long time and might cause a timeout error. I ran this query on a table with about 600,000 rows and it did not respond for several days, so I conclude that this approach only suits small tables.
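
For a small table, the wrapped NOT IN delete can be tried out as below. This is only a sketch: it uses Python's sqlite3 as a stand-in, so it shows which rows the query keeps but does not reproduce MySQL's target-table restriction or its timing, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER PRIMARY KEY, title TEXT)")
# Hypothetical sample data: title "x" appears three times, "y" twice.
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "x"), (2, "y"), (3, "x"), (4, "x"), (5, "y")],
)

# Keep the lowest id per title; the inner (SELECT * ...) AS x mirrors the
# wrapping used in the answer above.
conn.execute(
    """
    DELETE FROM myTable
    WHERE id NOT IN (
        SELECT MIN(id) FROM (SELECT * FROM myTable) AS x GROUP BY title
    )
    """
)
remaining = conn.execute("SELECT id, title FROM myTable ORDER BY id").fetchall()
print(remaining)
```

Only ids 1 and 2 remain, one row per title.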

I hope this is helpful for everyone! :)

Delete duplicate rows in mySQL in same table

MySQL provides you with the DELETE JOIN statement that allows you to remove duplicate rows quickly.

The following statement deletes duplicate rows and keeps the highest id:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE
t1.id < t2.id AND
t1.unique_col = t2.unique_col;

In case you want to delete duplicate rows and keep the lowest id, you can use the following statement:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE
t1.id > t2.id AND
t1.unique_col = t2.unique_col;

Remove duplicate rows from mysql table result

If you expect your query to return the number of duplicates, then no, it is not correct.

The condition t1.id < t2.id joins every id of t1 with all the ids from t2 that are greater, which produces more rows, or fewer rows when there are only 2 duplicates, and only rarely the expected number.
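
A small worked example makes the mismatch concrete. The original query is not shown here, so the join below is an assumed shape, and the data is hypothetical; Python's sqlite3 is used as a stand-in so the sketch runs as-is. With four rows sharing one hawb, the join yields one row per pair, which is neither the number of duplicated rows nor the number of surplus rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (id INTEGER PRIMARY KEY, service TEXT, hawb TEXT)"
)
# Hypothetical data: hawb H1 appears four times, H2 once.
conn.executemany(
    "INSERT INTO consignment VALUES (?, ?, ?)",
    [
        (1, "CLRC", "H1"), (2, "CLRC", "H1"), (3, "CLRC", "H1"),
        (4, "CLRC", "H1"), (5, "CLRC", "H2"),
    ],
)

# Rows produced by a self-join on t1.id < t2.id (assumed query shape).
pair_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM consignment t1
    INNER JOIN consignment t2
        ON t1.hawb = t2.hawb AND t1.id < t2.id
    WHERE t1.service = 'CLRC' AND t2.service = 'CLRC'
    """
).fetchone()[0]

# Surplus rows: total rows minus one survivor per hawb.
surplus = conn.execute(
    """
    SELECT COUNT(*) - COUNT(DISTINCT hawb)
    FROM consignment
    WHERE service = 'CLRC'
    """
).fetchone()[0]

print(pair_count, surplus)
```

The join returns 6 rows (every pair among the four H1 rows), while only 3 rows are actually surplus.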

If you want to see all the duplicates:

select * from consignment t
where t.service = 'CLRC'
and exists (
select 1 from consignment
where service = t.service and id <> t.id and hawb = t.hawb
)

If you want to delete the duplicates and keep only the ones with the max id for each hawb, then:

delete from consignment
where service='CLRC'
and id not in (
select id from (
select max(id) id from consignment
where service='CLRC'
group by hawb
) t
);

MySQL delete duplicate records but keep latest

Imagine your table test contains the following data:

select id, email
from test;

ID EMAIL
---------------------- --------------------
1 aaa
2 bbb
3 ccc
4 bbb
5 ddd
6 eee
7 aaa
8 aaa
9 eee

So, we need to find all repeated emails and, for each one, delete every row except the one with the latest id.

In this case, aaa, bbb and eee are repeated, so we want to delete IDs 1, 7, 2 and 6.

To accomplish this, first we need to find all the repeated emails:

select email
from test
group by email
having count(*) > 1;

EMAIL
--------------------
aaa
bbb
eee

Then, from this dataset, we need to find the latest id for each one of these repeated emails:

select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email;

LASTID EMAIL
---------------------- --------------------
8 aaa
4 bbb
9 eee

Finally, we can delete every row for these emails whose id is smaller than LASTID. So the solution is:

delete test
from test
inner join (
select max(id) as lastId, email
from test
where email in (
select email
from test
group by email
having count(*) > 1
)
group by email
) duplic on duplic.email = test.email
where test.id < duplic.lastId;

I don't have MySQL installed on this machine right now, but this should work.

Update

The above delete works, but I found a more optimized version:

delete test
from test
inner join (
select max(id) as lastId, email
from test
group by email
having count(*) > 1) duplic on duplic.email = test.email
where test.id < duplic.lastId;

You can see that it deletes the oldest duplicates, i.e. 1, 7, 2, 6:

select * from test;
+----+-------+
| id | email |
+----+-------+
| 3 | ccc |
| 4 | bbb |
| 5 | ddd |
| 8 | aaa |
| 9 | eee |
+----+-------+
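
The walkthrough above can be replayed end to end on the same nine rows. The sketch below uses Python's sqlite3 as a stand-in for MySQL: the join and filter are taken from the optimized DELETE, first run as a SELECT to preview the ids and then reused through IN, because SQLite has no multi-table DELETE. The constant name OLD_DUPLICATES is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        (1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "bbb"), (5, "ddd"),
        (6, "eee"), (7, "aaa"), (8, "aaa"), (9, "eee"),
    ],
)

# Same join and filter as the optimized DELETE, written as a SELECT.
OLD_DUPLICATES = """
SELECT test.id
FROM test
INNER JOIN (
    SELECT MAX(id) AS lastId, email
    FROM test
    GROUP BY email
    HAVING COUNT(*) > 1
) duplic ON duplic.email = test.email
WHERE test.id < duplic.lastId
"""

to_delete = sorted(row[0] for row in conn.execute(OLD_DUPLICATES))
# SQLite has no multi-table DELETE, so the preview query is reused via IN.
conn.execute("DELETE FROM test WHERE id IN (" + OLD_DUPLICATES + ")")
remaining = [row[0] for row in conn.execute("SELECT id FROM test ORDER BY id")]
print(to_delete, remaining)
```

The preview selects ids 1, 2, 6 and 7, and ids 3, 4, 5, 8 and 9 remain afterwards, matching the result table above.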

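Another version is the delete provided by Rene Limon. MySQL will not let the NOT IN subquery read the target table directly, so it is wrapped in a derived table here:

delete from test
where id not in (
select id from (
select max(id) as id
from test
group by email
) t
);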

How to remove duplicate MySQL records (but only leave one)

You can use this to keep the row with the lowest id value:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;

Or you can change > to < to keep the highest id:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;
