Remove duplicate rows in MySQL
A really easy way to do this is to add a UNIQUE index on the 3 columns. When you write the ALTER statement, include the IGNORE keyword (note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions). Like so:
ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);
This will drop the duplicate rows, keeping one row for each distinct (site_id, title, company) combination. As an added benefit, future INSERTs that are duplicates will error out. As always, you may want to take a backup before running something like this.
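As a rough sketch of the "take a backup" advice, the statements below copy the table and then list the groups that the unique index would collapse. The name jobs_backup is a placeholder chosen for illustration, not something from the original answer.

```sql
-- Copy the table structure and data (jobs_backup is a hypothetical name).
CREATE TABLE jobs_backup LIKE jobs;
INSERT INTO jobs_backup SELECT * FROM jobs;

-- Preview which (site_id, title, company) combinations occur more than once.
SELECT site_id, title, company, COUNT(*) AS copies
FROM jobs
GROUP BY site_id, title, company
HAVING COUNT(*) > 1;
```

If the preview returns no rows, there is nothing to deduplicate and the index can be added with a plain ALTER TABLE, without IGNORE.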
How do I delete all the duplicate records in a MySQL table without temp tables
Add a unique index on your table:
ALTER IGNORE TABLE `TableA`
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);
Another way to do this would be:
Add a primary key (id in the query below) to your table; you can then remove the duplicates, keeping the row with the lowest id in each group, using the following query:
DELETE FROM member
WHERE id NOT IN (SELECT keep_id
FROM (SELECT MIN(id) AS keep_id FROM member
GROUP BY member_id, quiz_num, question_num, answer_num
) AS A
);
Delete all Duplicate Rows except for One in MySQL?
Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.
NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.
If you want to keep the row with the lowest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
If you want to keep the row with the highest id value:
DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1. Not sure about other versions.
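To try the keep-lowest-id variant safely, here is a small sketch on throwaway data. The table definition and the sample rows are made up for illustration; only the DELETE statement comes from the answer above.

```sql
-- Hypothetical test copy of the names table with explicit ids.
CREATE TABLE names (
  id   INT PRIMARY KEY,
  name VARCHAR(50) NOT NULL
);
INSERT INTO names (id, name) VALUES
  (1, 'alice'), (2, 'alice'), (3, 'bob'), (4, 'alice');

-- Preview the rows the DELETE would remove: every row that has
-- a same-named row with a lower id (here ids 2 and 4).
SELECT DISTINCT n1.id, n1.name
FROM names n1
JOIN names n2 ON n1.name = n2.name AND n1.id > n2.id;

-- Keep the lowest id per name.
DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name;

-- Remaining rows: (1, 'alice') and (3, 'bob').
SELECT id, name FROM names ORDER BY id;
```

On a real table, an index on the compared column (name here) will probably help the self-join, though that is an assumption to verify with EXPLAIN rather than a guarantee.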
Update, since people Googling for removing duplicates end up here:
Although the OP's question is about DELETE, please be advised that using INSERT and DISTINCT is much faster. For a database with 8 million rows, the query below took 13 minutes, while the DELETE ran for more than 2 hours and still had not completed.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
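The INSERT above assumes tempTableName already exists. One possible end-to-end flow is sketched below; it assumes the four listed columns are the only ones that need to be carried over, and tableName_old is a placeholder name introduced here for illustration.

```sql
-- Create an empty table with the same columns and indexes.
-- Note: CREATE TABLE ... LIKE does not copy foreign key definitions.
CREATE TABLE tempTableName LIKE tableName;

-- Copy one row per distinct combination of the four columns.
INSERT INTO tempTableName (cellId, attributeId, entityRowId, value)
SELECT DISTINCT cellId, attributeId, entityRowId, value
FROM tableName;

-- Swap the tables in one statement; the original is kept under a
-- hypothetical name (tableName_old) until the result is verified.
RENAME TABLE tableName TO tableName_old, tempTableName TO tableName;

-- Once the deduplicated table has been checked:
-- DROP TABLE tableName_old;
```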
How to delete duplicates from one table, but keeping only one record?
I finally found the exact cause of the issue I faced.
I followed the comment by @Malakiyasanjay.
You can find it under the question "How to keep only one row of a table, removing duplicate rows?"
I tried it like this (it worked for me, but the query took a long time to run on 30,000 rows):
delete from myTable
where id not in
(select min(id) as min from (select * from myTable) as x group by title)
The problem was that I couldn't specify 'myTable' as the target table inside the subquery, so I used (select * from myTable) as x and that solved it.
I'm sorry I can't explain this in more detail because I am not very familiar with MySQL queries. But you should note that:
MySQL does not allow the direct use of the target table inside a subquery like the one you use with NOT IN, but you can overcome this limitation by enclosing the subquery inside another one.
(Please refer to @forpas's answer.)
Be aware that this takes a very long time and might cause a timeout error. I ran this query on a table with about 600,000 rows and it did not respond for several days. So I conclude that this approach only suits small tables.
I hope this is helpful for everyone! :)
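For larger tables, a join against a grouped derived table is sometimes used instead of NOT IN. The sketch below is not the method from the answer above but a swapped-in alternative; the alias names t, k and keep_id are illustrative, and whether it is actually faster on a given table is something to measure.

```sql
-- Keep the lowest id per title; delete every other row in the group.
-- The grouped derived table is materialized, so MySQL accepts it even
-- though it reads from the table being deleted from.
DELETE t
FROM myTable AS t
JOIN (
  SELECT title, MIN(id) AS keep_id
  FROM myTable
  GROUP BY title
) AS k ON k.title = t.title
WHERE t.id <> k.keep_id;
-- Rows whose title is NULL never match the join and are left untouched.
```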
Delete duplicate rows in MySQL based on contents of another table
@MHardwick and @ShadowRay almost got it right. The following also checks that the email exists more than once in tb_email_to_members:
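DELETE FROM tb_email_to_members
WHERE email_id NOT IN (SELECT frn_email_id FROM tb_tx)
-- The extra derived table (dup) avoids error 1093, since MySQL cannot read the DELETE target directly in a subquery.
AND email_address IN (SELECT email_address FROM (SELECT email_address FROM tb_email_to_members GROUP BY email_address HAVING COUNT(email_address) > 1) AS dup);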
And obviously changing DELETE to SELECT * will show you exactly what you're about to delete.
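Spelled out, the preview would look roughly like this. It reuses the table and column names from the DELETE above and changes nothing in the data.

```sql
-- Dry run: list the rows the DELETE would remove.
SELECT *
FROM tb_email_to_members
WHERE email_id NOT IN (SELECT frn_email_id FROM tb_tx)
  AND email_address IN (
    SELECT email_address
    FROM tb_email_to_members
    GROUP BY email_address
    HAVING COUNT(email_address) > 1
  );
```

Keep in mind that if tb_tx.frn_email_id can contain NULL, NOT IN matches no rows at all, so an empty preview does not necessarily mean there are no duplicates.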
Bonus points for knowing tb is short for tidbits?