How to Delete Duplicates in MySQL Table

Remove duplicate rows in MySQL

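An easy way to do this is to add a UNIQUE index on the three columns that define a duplicate, and to include the IGNORE keyword in the ALTER statement. Note that ALTER IGNORE TABLE was deprecated in MySQL 5.6.17 and removed in 5.7.4, so this only works on older servers. Like so: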

ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);

This drops the duplicate rows, keeping one row for each distinct (site_id, title, company) combination. As an added benefit, future INSERTs that would create a duplicate will fail with an error. As always, take a backup before running something like this.
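
On servers where ALTER IGNORE TABLE is not accepted, one commonly used alternative is to copy the rows into a new table that already carries the unique index and let INSERT IGNORE skip the duplicates. The following is only a sketch: the names jobs_dedup and jobs_old are assumptions for illustration, not part of the original answer, and it should be tried on a copy first.

```sql
-- Hypothetical names: jobs_dedup and jobs_old are made up for this sketch.
-- Create an empty table with the same columns and indexes as jobs.
CREATE TABLE jobs_dedup LIKE jobs;

ALTER TABLE jobs_dedup
ADD UNIQUE INDEX idx_name (site_id, title, company);

-- INSERT IGNORE skips any row that would violate the unique index.
INSERT IGNORE INTO jobs_dedup
SELECT * FROM jobs;

-- Swap the tables once the copy has been checked.
RENAME TABLE jobs TO jobs_old, jobs_dedup TO jobs;
```

Keep in mind that CREATE TABLE ... LIKE does not copy foreign key definitions, and that a unique index treats rows with NULL in any of the indexed columns as distinct, so those rows are not removed.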

How do I delete all the duplicate records in a MySQL table without temp tables

Add Unique Index on your table:

ALTER IGNORE TABLE `TableA`   
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);

Another way to do this would be:

Add a primary key (id) to your table. You can then remove the duplicates, keeping the row with the lowest id in each group, with the following query:

DELETE FROM member
WHERE id NOT IN (SELECT *
FROM (SELECT MIN(id) FROM member
GROUP BY member_id, quiz_num, question_num, answer_num
) AS A
);
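
Before running a DELETE like the one above, it may help to look at which combinations are actually duplicated. A minimal sketch, assuming the same member table and columns (the alias copies is made up here):

```sql
-- List each duplicated combination and how many copies of it exist.
SELECT member_id, quiz_num, question_num, answer_num, COUNT(*) AS copies
FROM member
GROUP BY member_id, quiz_num, question_num, answer_num
HAVING COUNT(*) > 1;
```

If this returns no rows, there is nothing to delete.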

Delete all Duplicate Rows except for One in MySQL?


Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

  1. If you want to keep the row with the lowest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name;

  2. If you want to keep the row with the highest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name;

I used this method in MySQL 5.1

Not sure about other versions.
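
The same self-join can be run as a SELECT first to see which rows the DELETE would remove. This sketch assumes the names table from the answer and that name is a VARCHAR column; the index name idx_names_name is hypothetical, and whether the index helps enough on a large table was not measured here.

```sql
-- Optional: an index on the compared column lets the join look up matches
-- instead of comparing every pair of rows. (Index name is hypothetical.)
CREATE INDEX idx_names_name ON names (name);

-- Rows that variant 1 would delete: every row that has a same-named row
-- with a lower id. DISTINCT is needed because a row can match several others.
SELECT DISTINCT n1.*
FROM names n1
JOIN names n2
  ON n1.name = n2.name
 AND n1.id > n2.id;
```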


Update: Since people Googling for removing duplicates end up here

Although the OP's question is about DELETE, please be advised that using INSERT with DISTINCT is much faster. For a table with 8 million rows, the query below took 13 minutes, while the DELETE ran for more than 2 hours without completing.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
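
The INSERT above assumes that tempTableName already exists. One way the whole flow might look, assuming the four listed columns are all the columns of tableName (the name tableName_old is hypothetical):

```sql
-- Create an empty table with the same structure as the original.
CREATE TABLE tempTableName LIKE tableName;

-- Copy each distinct row once.
INSERT INTO tempTableName (cellId, attributeId, entityRowId, value)
SELECT DISTINCT cellId, attributeId, entityRowId, value
FROM tableName;

-- Swap the tables; the original is kept as tableName_old until verified.
RENAME TABLE tableName TO tableName_old, tempTableName TO tableName;
```

If tableName has further columns, such as an auto-increment id, they are not carried over by this SELECT and would need separate handling.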

How to delete duplicates from one table, but keeping only one record?

I finally found the exact cause of the issue I was facing, thanks to a comment by @Malakiyasanjay on How to keep only one row of a table, removing duplicate rows?

This is what I tried. It worked for me, but the query took a long time to run on 30,000 rows:

delete from myTable
where id not in
(select min(id) as min from (select * from myTable) as x group by title)

The problem was that I couldn't name 'myTable' as the target table and also select from it in the subquery, so I used (select * from myTable) as x to get around it.

I can't explain this in more detail because I am not very familiar with MySQL queries, but note that:

MySQL does not allow the direct use of the target table inside a subquery like the one you use with NOT IN, but you can overcome this limitation by enclosing the subquery inside another one.
(See @forpas's answer.)

Be aware that this can take a very long time and may cause a timeout error. I ran this query on a table with about 600,000 rows and it had not responded after several days, so I conclude that this approach is only suitable for small tables.
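
A join against a derived table is sometimes suggested as an alternative to NOT IN for this kind of delete. The sketch below assumes the same myTable with id and title columns; the aliases t, k and keep_id are made up, and no timing was measured, so any speed difference is an assumption to verify on a copy of the data.

```sql
-- Keep the lowest id per title and delete every other row.
DELETE t
FROM myTable AS t
LEFT JOIN (
    -- One surviving id per title; the GROUP BY makes this a
    -- materialized derived table rather than a direct reference.
    SELECT MIN(id) AS keep_id
    FROM myTable
    GROUP BY title
) AS k ON t.id = k.keep_id
WHERE k.keep_id IS NULL;
```

Rows whose id matches no keep_id are exactly the extra copies, which is the same set the NOT IN version targets.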

I hope this is helpful for everyone! :)

Delete duplicate rows in MySQL based on contents of another table

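@MHardwick and @ShadowRay almost got it right. The following also checks that the email exists more than once in tb_email_to_members. The duplicate check is wrapped in a derived table because MySQL does not allow the target table of a DELETE to be selected from directly in a subquery.

DELETE FROM tb_email_to_members
WHERE email_id NOT IN (SELECT frn_email_id FROM tb_tx)
AND email_address IN (SELECT email_address FROM (SELECT email_address FROM tb_email_to_members GROUP BY email_address HAVING COUNT(email_address) > 1) AS dup);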

And obviously changing DELETE to SELECT * will show you what exactly you're about to delete.
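
Written out, that preview might look like the following, using the same tables and columns as above:

```sql
-- Same conditions as the DELETE, but only listing the affected rows.
SELECT *
FROM tb_email_to_members
WHERE email_id NOT IN (SELECT frn_email_id FROM tb_tx)
AND email_address IN (
    SELECT email_address
    FROM tb_email_to_members
    GROUP BY email_address
    HAVING COUNT(email_address) > 1
);
```

One thing to check while previewing: if tb_tx.frn_email_id can contain NULL, the NOT IN condition matches no rows at all, so an empty result does not necessarily mean there are no duplicates.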

Bonus points for knowing tb is short for tidbits?


