Remove Duplicate Rows in MySQL

Remove duplicate rows in MySQL

A really easy way to do this is to add a UNIQUE index on the 3 columns. When you write the ALTER statement, include the IGNORE keyword. Like so:

ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);

This will drop all the duplicate rows. As an added benefit, future INSERTs that are duplicates will error out. As always, you may want to take a backup before running something like this...

Delete all Duplicate Rows except for One in MySQL?

Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

  1. If you want to keep the row with the lowest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
  2. If you want to keep the row with the highest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

I used this method in MySQL 5.1

Not sure about other versions.


Update: Since people Googling for removing duplicates end up here

Although the OP's question is about DELETE, please be advised that using INSERT and DISTINCT is much faster. For a database with 8 million rows, the below query took 13 minutes, while using DELETE, it took more than 2 hours and yet didn't complete.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;

Remove duplicate rows from mysql table result

If you expect your query to return the number of duplicates then no it is not correct.

The condition t1.id < t2.id will join every id of t1 with all ids from t2 that are greater resulting on more rows or less rows (in the case of only 2 duplicates) and rarely in the expected number.

See the demo.

If you want to see all the duplicates:

select * from consignment t
where t.service = 'CLRC'
and exists (
select 1 from consignment
where service = t.service and id <> t.id and hawb = t.hawb
)

See the demo.

If you want to delete the duplicates and keep only the one ones with the max id for each hawb then:

delete from consignment
where service='CLRC'
and id not in (
select id from (
select max(id) id from consignment
where service='CLRC'
group by hawb
) t
);

See the demo.

MySQL use DELETE FROM to remove duplicates rows

First, this is a very bad way of implementing this code. But I guess you get what you pay for.

Second, simply run the query as a select:

SELECT p1.*, p2.*
FROM Person p1 JOIN
Person p2
ON p1.Email = p2.Email AND p1.Id > p2.Id;

(Note that I've rewritten the logic as a JOIN. You should always use proper, explicit, standard, readable JOIN syntax, but the two methods are functionally equivalent.)

On your second example, the results of this query are:

table1 email     table1 id    table2 id
john@example.com. 2 1
john@example.com. 3 1
john@example.com. 3 2

What is notable is that id = 1 is never in the second column -- and that is the column that determines which ids are deleted. In other words, all but the smallest id for each email get deleted because there is a smaller id.

This also hints at why this is a really bad solution. MySQL has to deal with two rows for id = 3. Perhaps it attempts to delete both. Perhaps it has to just deal with extra data. Either way, there is extra work. And the more rows with the same email in the data the more extra duplicates are created.

An alternative method, such as:

delete p
from person p join
(select email, min(id) as min_id
from person p2
group by email
) p2
on p.email = p2.email and p.id > p2.min_id;

Does not have this problem and, in my opinion, the intent is clearer.

Removing duplicate rows (MySQL 5.7)

The alias c1 needs to be referenced between DELETE and FROM for MySql.

Try this way:

DELETE FROM pmd_listings
WHERE id NOT IN (
SELECT t.id FROM (
SELECT MIN(id) id
FROM pmd_listings
GROUP BY friendly_url, www, listing_zip
) t
)

How to remove duplicate MySQL records (but only leave one)

You can use this to keep the row with the lowest id value

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;

this an example link link 1

or you can change > to < for keep the highest id

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;

this an example link link 2



Related Topics



Leave a reply



Submit