Delete Duplicate Record from Same Table in MySQL

Delete duplicate record from same table in MySQL

The following DELETE removes all duplicates, leaving you with the row that has the latest (highest) CustomerID for each name.

A note of warning, though: I don't know your use case, but it is perfectly possible for two people to have the exact same name (at one point we even had a case where the address was the same too).

DELETE  c1
FROM    tblm_customer c1
        , tblm_customer c2
WHERE   c1.FirstName = c2.FirstName 
        AND c1.LastName = c2.LastName 
        AND c1.CustomerID < c2.CustomerID

Delete all Duplicate Rows except for One in MySQL?

Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

If you want to keep the row with the lowest id value:

DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

If you want to keep the row with the highest id value:

DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

I used this method in MySQL 5.1; I am not sure about other versions.
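
In the spirit of trying this on a test copy first, here is a minimal sketch that previews which rows the "keep the lowest id" DELETE would remove, by running the same join condition as a SELECT. It uses Python's built-in sqlite3 with an in-memory database as a stand-in for MySQL, and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; in-memory SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "ann"), (4, "ann"), (5, "cid")],
)

# Same join condition as the "keep the lowest id" DELETE, written as a SELECT.
# DISTINCT is needed because one row can pair with several lower ids.
doomed = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT n1.id FROM names n1, names n2 "
        "WHERE n1.id > n2.id AND n1.name = n2.name "
        "ORDER BY n1.id"
    )
]
print(doomed)  # [3, 4]
```

Only ids 3 and 4 are listed, so id 1 (the lowest "ann") and the non-duplicated rows would survive.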

Update: since people Googling for how to remove duplicates end up here:
Although the OP's question is about DELETE, please be advised that using INSERT with SELECT DISTINCT is much faster. On a table with 8 million rows, the query below took 13 minutes, whereas the DELETE ran for more than 2 hours and still had not completed.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
    SELECT DISTINCT cellId,attributeId,entityRowId,value
    FROM tableName;
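
One thing worth checking with this approach is what DISTINCT actually treats as a duplicate: only rows that match in every selected column are collapsed. The sketch below shows this with Python's built-in sqlite3 (an in-memory stand-in for MySQL); the column types and sample rows are assumptions, since the original schema is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = "cellId, attributeId, entityRowId, value"
# Column types are assumed; the original schema is not shown.
ddl = "(cellId INTEGER, attributeId INTEGER, entityRowId INTEGER, value TEXT)"
conn.execute("CREATE TABLE tableName " + ddl)
conn.execute("CREATE TABLE tempTableName " + ddl)
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, "a"),
        (1, 10, 100, "a"),  # exact duplicate: dropped by DISTINCT
        (1, 10, 100, "b"),  # differs in value: kept
        (2, 20, 200, "c"),
    ],
)

# Copy only the distinct rows into the new table.
conn.execute(
    f"INSERT INTO tempTableName ({cols}) SELECT DISTINCT {cols} FROM tableName"
)
copied = conn.execute("SELECT COUNT(*) FROM tempTableName").fetchone()[0]
print(copied)  # 3
```

If the table has an auto-increment id, it has to be left out of the column list (as in the query above), otherwise no two rows would ever compare as equal.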

How to delete duplicates from one table, but keeping only one record?

I finally found the exact cause of the issue I faced.
I followed the comment by @Malakiyasanjay,
which you can find here: How to keep only one row of a table, removing duplicate rows?

I tried the following, and it worked for me as well, although the query took a long time to run on 30,000 rows:

delete from myTable
where id not in 
(select min(id) as min from (select * from myTable) as x group by title)

The problem was that MySQL would not let me reference 'myTable', the target table of the DELETE, directly in the subquery. So I wrapped it as (select * from myTable) as x, and that solved it.

I am sorry I can't explain this in more detail, because I am not very familiar with MySQL queries. But you should note that:

MySQL does not allow the direct use of the target table inside a subquery like the one you use with NOT IN, but you can overcome this limitation by enclosing the subquery inside another one.
(Please refer to @forpas's answer.)

Be aware, though, that this takes a very long time and might cause a timeout error. I ran this query on a table with about 600,000 rows and it did not respond for several days. So I conclude that this approach is only suitable for small tables.

I hope this is helpful for everyone! :)

Delete duplicate rows in mySQL in same table

MySQL provides you with the DELETE JOIN statement that allows you to remove duplicate rows quickly.

The following statement deletes duplicate rows and keeps the highest id:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2 
WHERE 
    t1.id < t2.id AND 
    t1.unique_col = t2.unique_col;

In case you want to delete duplicate rows and keep the lowest id, you can use the following statement:

DELETE t1 FROM table_name t1
INNER JOIN table_name t2 
WHERE
    t1.id > t2.id AND 
    t1.unique_col = t2.unique_col;

Remove duplicate rows from mysql table result

If you expect your query to return the number of duplicates, then no, it is not correct.

The condition t1.id < t2.id joins every id of t1 with all ids from t2 that are greater, so the result has more rows than expected, or fewer (in the case of only 2 duplicates), and only rarely the expected number.

See the demo.
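
To make the row counts concrete: a group of n rows sharing a hawb yields n(n-1)/2 pairs under t1.id < t2.id. Here is a small sketch using Python's built-in sqlite3 (in-memory, standing in for MySQL). The join on hawb and the helper joined_rows are assumptions for illustration, since the original query is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (id INTEGER PRIMARY KEY, service TEXT, hawb TEXT)"
)

def joined_rows(copies):
    """Rows returned by the t1.id < t2.id self-join when one hawb appears `copies` times."""
    conn.execute("DELETE FROM consignment")
    conn.executemany(
        "INSERT INTO consignment (service, hawb) VALUES ('CLRC', ?)",
        [("H1",)] * copies,
    )
    return conn.execute(
        "SELECT COUNT(*) FROM consignment t1 "
        "JOIN consignment t2 ON t1.hawb = t2.hawb AND t1.id < t2.id"
    ).fetchone()[0]

counts = {n: joined_rows(n) for n in (2, 3, 4)}
print(counts)  # {2: 1, 3: 3, 4: 6}
```

Two copies give 1 row (too few), four copies give 6 (too many), and only three copies happen to give 3.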

If you want to see all the duplicates:

select * from consignment t
where t.service = 'CLRC' 
and exists (
  select 1 from consignment
  where service = t.service and id <> t.id and hawb = t.hawb 
)

See the demo.

If you want to delete the duplicates and keep only the row with the max id for each hawb, then:

delete from consignment
where service='CLRC'
and id not in (
  select id from (
    select max(id) id from consignment
    where service='CLRC' 
    group by hawb
  ) t  
);

See the demo.
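
As a quick sanity check of the delete above, here is a minimal sketch that runs the same statement with Python's built-in sqlite3 (in-memory, as a stand-in for MySQL) on made-up rows. It also shows that rows with a different service are left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE consignment (id INTEGER PRIMARY KEY, service TEXT, hawb TEXT)"
)
# Hypothetical rows: hawb 'A' appears three times for CLRC and twice for OTHER.
conn.executemany(
    "INSERT INTO consignment VALUES (?, ?, ?)",
    [
        (1, "CLRC", "A"), (2, "CLRC", "A"), (3, "CLRC", "B"),
        (4, "OTHER", "A"), (5, "OTHER", "A"), (6, "CLRC", "A"),
    ],
)

# Keep only the max id per hawb among the CLRC rows.
conn.execute("""
    DELETE FROM consignment
    WHERE service = 'CLRC'
      AND id NOT IN (
        SELECT id FROM (
          SELECT MAX(id) AS id FROM consignment
          WHERE service = 'CLRC'
          GROUP BY hawb
        ) t
      )
""")
remaining = [r[0] for r in conn.execute("SELECT id FROM consignment ORDER BY id")]
print(remaining)  # [3, 4, 5, 6]
```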

MySQL delete duplicate records but keep latest

Imagine your table test contains the following data:

  select id, email
    from test;

ID                     EMAIL                
---------------------- -------------------- 
1                      aaa                  
2                      bbb                  
3                      ccc                  
4                      bbb                  
5                      ddd                  
6                      eee                  
7                      aaa                  
8                      aaa                  
9                      eee

So we need to find all repeated emails and, for each of them, delete every row except the one with the latest id.

In this case, aaa, bbb and eee are repeated, so we want to delete IDs 1, 7, 2 and 6.

To accomplish this, first we need to find all the repeated emails:

      select email 
        from test
       group by email
      having count(*) > 1;

EMAIL                
-------------------- 
aaa                  
bbb                  
eee

Then, from this dataset, we need to find the latest id for each one of these repeated emails:

  select max(id) as lastId, email
    from test
   where email in (
              select email 
                from test
               group by email
              having count(*) > 1
       )
   group by email;

LASTID                 EMAIL                
---------------------- -------------------- 
8                      aaa                  
4                      bbb                  
9                      eee

Finally, we can delete all rows for these emails that have an id smaller than LASTID. So the solution is:

delete test
  from test
 inner join (
  select max(id) as lastId, email
    from test
   where email in (
              select email 
                from test
               group by email
              having count(*) > 1
       )
   group by email
) duplic on duplic.email = test.email
 where test.id < duplic.lastId;

I don't have MySQL installed on this machine right now, but this should work.

Update

The above delete works, but I found a more optimized version:

 delete test
   from test
  inner join (
     select max(id) as lastId, email
       from test
      group by email
     having count(*) > 1) duplic on duplic.email = test.email
  where test.id < duplic.lastId;

You can see that it deletes the oldest duplicates, i.e. 1, 7, 2, 6:

select * from test;
+----+-------+
| id | email |
+----+-------+
|  3 | ccc   |
|  4 | bbb   |
|  5 | ddd   |
|  8 | aaa   |
|  9 | eee   |
+----+-------+
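
If you have no MySQL at hand either, the expected result can be reproduced with Python's built-in sqlite3. Note that SQLite has no DELETE ... JOIN, so this sketch swaps in a correlated subquery that expresses the same rule (delete every row that has a later id for the same email). It is a stand-in for checking the logic, not the MySQL statement itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, email TEXT)")
# Same sample data as the table above, ids 1..9.
emails = ["aaa", "bbb", "ccc", "bbb", "ddd", "eee", "aaa", "aaa", "eee"]
conn.executemany("INSERT INTO test VALUES (?, ?)", list(enumerate(emails, start=1)))

# A row is an old duplicate if a higher id exists for the same email.
older = "id < (SELECT MAX(id) FROM test AS t2 WHERE t2.email = test.email)"

to_delete = [r[0] for r in conn.execute(f"SELECT id FROM test WHERE {older} ORDER BY id")]
conn.execute(f"DELETE FROM test WHERE {older}")
remaining = conn.execute("SELECT id, email FROM test ORDER BY id").fetchall()

print(to_delete)  # [1, 2, 6, 7]
print(remaining)  # [(3, 'ccc'), (4, 'bbb'), (5, 'ddd'), (8, 'aaa'), (9, 'eee')]
```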

Another version is the delete provided by Rene Limon:

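delete from test
 where id not in (
    -- wrapped in a derived table: MySQL rejects a subquery that reads the DELETE target directly
    select lastId from (
       select max(id) as lastId
         from test
        group by email) as latest);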

How to remove duplicate MySQL records (but only leave one)

You can use this to keep the row with the lowest id value:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;

Or you can change > to < to keep the highest id:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;
