Remove duplicate rows in MySQL
A really easy way to do this is to add a UNIQUE
index on the 3 columns. When you write the ALTER
statement, include the IGNORE
keyword. Like so:
ALTER IGNORE TABLE jobs
ADD UNIQUE INDEX idx_name (site_id, title, company);
This will drop all the duplicate rows. As an added benefit, future INSERTs
that are duplicates will error out. As always, you may want to take a backup before running something like this...
Delete all Duplicate Rows except for One in MySQL?
Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.
NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also included AND n1.id <> n2.id
, it deleted every row in the table.
If you want to keep the row with the lowest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
If you want to keep the row with the highest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
Update: Since people Googling for removing duplicates end up here
Although the OP's question is about DELETE
, please be advised that using INSERT
and DISTINCT
is much faster. For a database with 8 million rows, the below query took 13 minutes, while using DELETE
, it took more than 2 hours and yet didn't complete.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
Remove duplicate rows from mysql table result
If you expect your query to return the number of duplicates then no it is not correct.
The condition t1.id < t2.id
will join every id
of t1
with all id
s from t2
that are greater resulting on more rows or less rows (in the case of only 2 duplicates) and rarely in the expected number.
See the demo.
If you want to see all the duplicates:
select * from consignment t
where t.service = 'CLRC'
and exists (
select 1 from consignment
where service = t.service and id <> t.id and hawb = t.hawb
)
See the demo.
If you want to delete the duplicates and keep only the one ones with the max id
for each hawb
then:
delete from consignment
where service='CLRC'
and id not in (
select id from (
select max(id) id from consignment
where service='CLRC'
group by hawb
) t
);
See the demo.
MySQL use DELETE FROM to remove duplicates rows
First, this is a very bad way of implementing this code. But I guess you get what you pay for.
Second, simply run the query as a select
:
SELECT p1.*, p2.*
FROM Person p1 JOIN
Person p2
ON p1.Email = p2.Email AND p1.Id > p2.Id;
(Note that I've rewritten the logic as a JOIN
. You should always use proper, explicit, standard, readable JOIN
syntax, but the two methods are functionally equivalent.)
On your second example, the results of this query are:
table1 email table1 id table2 id
john@example.com. 2 1
john@example.com. 3 1
john@example.com. 3 2
What is notable is that id = 1 is never in the second column -- and that is the column that determines which ids are deleted. In other words, all but the smallest id for each email get deleted because there is a smaller id
.
This also hints at why this is a really bad solution. MySQL has to deal with two rows for id = 3
. Perhaps it attempts to delete both. Perhaps it has to just deal with extra data. Either way, there is extra work. And the more rows with the same email in the data the more extra duplicates are created.
An alternative method, such as:
delete p
from person p join
(select email, min(id) as min_id
from person p2
group by email
) p2
on p.email = p2.email and p.id > p2.min_id;
Does not have this problem and, in my opinion, the intent is clearer.
Removing duplicate rows (MySQL 5.7)
The alias c1
needs to be referenced between DELETE and FROM for MySql.
Try this way:
DELETE FROM pmd_listings
WHERE id NOT IN (
SELECT t.id FROM (
SELECT MIN(id) id
FROM pmd_listings
GROUP BY friendly_url, www, listing_zip
) t
)
How to remove duplicate MySQL records (but only leave one)
You can use this to keep the row with the lowest id value
DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;
this an example link link 1
or you can change >
to <
for keep the highest id
DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;
this an example link link 2
Related Topics
Insert into ... Values ( Select ... from ... )
Sql, Auxiliary Table of Numbers
Return Row With the Max Value of One Column Per Group
Error 1452: Cannot Add or Update a Child Row: a Foreign Key Constraint Fails
Null in MySQL (Performance & Storage)
How to Calculate Percentage With a SQL Statement
Remove Trailing Zeros from Decimal in SQL Server
How to Filter SQL Results in a Has-Many-Through Relation
Sort by Column Asc, But Null Values First
Select First Row of Every Group in Sql
SQL Server 2012 Column Identity Increment Jumping from 6 to 1000+ on 7Th Entry