Delete All But One Duplicate Record

ANSI SQL Solution

Use group by in a subquery:

delete from my_tab where id not in 
(select min(id) from my_tab group by profile_id, visitor_id);

This keeps the row with the lowest id in each (profile_id, visitor_id) group. You need some kind of unique identifier (here, I'm using id).
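To see the effect on a small table, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. The sample rows are made up for illustration, and SQLite is only a stand-in here; other engines may behave differently.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_tab (id INTEGER PRIMARY KEY, profile_id INTEGER, visitor_id INTEGER)"
)
conn.executemany(
    "INSERT INTO my_tab (id, profile_id, visitor_id) VALUES (?, ?, ?)",
    [(1, 10, 100), (2, 10, 100), (3, 10, 200), (4, 20, 100), (5, 20, 100)],
)

# Keep the lowest id in each (profile_id, visitor_id) group, delete the rest.
conn.execute(
    "DELETE FROM my_tab WHERE id NOT IN "
    "(SELECT MIN(id) FROM my_tab GROUP BY profile_id, visitor_id)"
)
conn.commit()

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM my_tab ORDER BY id")]
print(remaining_ids)
```

Rows 2 and 5 repeat an existing (profile_id, visitor_id) pair, so only those two are removed.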

MySQL Solution

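As pointed out by @JamesPoulson, MySQL rejects the query above because the subquery selects from the table being deleted from (error 1093, "You can't specify target table ... for update in FROM clause"). Wrapping the subquery in a derived table works around this, as shown in James' answer: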

delete from `my_tab` where id not in
( SELECT * FROM
(select min(id) from `my_tab` group by profile_id, visitor_id) AS temp_tab
);

Delete all but one duplicate rows in SQLite for version 3.22.0?

I don't think SQLite supports a table alias in DELETE.

Try the following query, which keeps the row with the highest id in each (code, issue) group:

delete from Data
where exists (select 1 from Data t2
where data.code = t2.code and data.issue = t2.issue
and data.id < t2.id);
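As a quick check, the statement can be run against an in-memory SQLite database from Python. This is only a sketch: the column types and sample rows below are assumptions, since the original table definition is not shown.

```python
import sqlite3

# Hypothetical schema and rows for the Data table used in the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (id INTEGER PRIMARY KEY, code TEXT, issue TEXT)")
conn.executemany(
    "INSERT INTO Data (id, code, issue) VALUES (?, ?, ?)",
    [(1, "A", "x"), (2, "A", "x"), (3, "A", "y"), (4, "B", "x"), (5, "A", "x")],
)

# A row is deleted when another row with the same (code, issue) has a higher id,
# so the highest id in each group survives.
conn.execute(
    "DELETE FROM Data WHERE EXISTS ("
    "SELECT 1 FROM Data t2 "
    "WHERE Data.code = t2.code AND Data.issue = t2.issue AND Data.id < t2.id)"
)
conn.commit()

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM Data ORDER BY id")]
print(remaining_ids)
```

Flipping `<` to `>` in the last condition would keep the lowest id instead.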

Postgres delete all duplicate records but one by sorting

Use a subquery with ROW_NUMBER() and PARTITION BY to number the rows within each region from most recent to oldest; every row numbered above 1 is a duplicate to remove. Give the subquery in FROM an alias (here, AS a), because PostgreSQL versions before 16 raise a syntax error without one:

SELECT *
FROM foo
WHERE id IN (
    SELECT a.id
    FROM (
        SELECT id, ROW_NUMBER() OVER (
            PARTITION BY region
            ORDER BY created_at DESC
        ) row_no
        FROM foo
    ) AS a
    WHERE row_no > 1
);

This returns the rows that would be deleted. Once you're satisfied with the result, replace SELECT * with DELETE to remove them.
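The preview-then-delete workflow can be sketched from Python. This uses an in-memory SQLite database as a stand-in for PostgreSQL (an assumption made to keep the example self-contained; it needs SQLite 3.25 or newer for window functions), and the rows in foo are invented for illustration.

```python
import sqlite3

# Subquery that lists every row except the most recent one in each region.
DUPLICATES = (
    "SELECT a.id FROM ("
    "  SELECT id, ROW_NUMBER() OVER ("
    "    PARTITION BY region ORDER BY created_at DESC"
    "  ) AS row_no FROM foo"
    ") AS a WHERE a.row_no > 1"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, region TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO foo (id, region, created_at) VALUES (?, ?, ?)",
    [
        (1, "eu", "2024-01-01"),
        (2, "eu", "2024-03-01"),
        (3, "us", "2024-02-01"),
        (4, "eu", "2024-02-01"),
        (5, "us", "2024-01-15"),
    ],
)

# Step 1: preview the ids that would be removed.
preview = sorted(
    row[0] for row in conn.execute(f"SELECT id FROM foo WHERE id IN ({DUPLICATES})")
)

# Step 2: same condition, but with DELETE instead of SELECT.
conn.execute(f"DELETE FROM foo WHERE id IN ({DUPLICATES})")
conn.commit()

kept = [row[0] for row in conn.execute("SELECT id FROM foo ORDER BY id")]
print(preview, kept)
```

Keeping the condition in one string means the preview and the delete cannot drift apart.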


Delete all but one duplicate from a mongo db

Rather than doing all of that, you can pick the first document in each group keyed by "$match._id" and make it the root document. Also, I don't think you need any sorting in your case:

db.collection.aggregate([
  {
    $group: {
      _id: "$match._id",
      doc: {
        $first: "$$ROOT"
      }
    }
  },
  {
    $replaceRoot: {
      newRoot: "$doc"
    }
  },
  { $out: 'DeleteableIds' }
])
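To make clear what the $group / $first / $replaceRoot stages compute, here is a plain-Python sketch of the same idea. It does not talk to MongoDB at all; the helper name first_per_group and the sample documents are hypothetical.

```python
def first_per_group(docs, key):
    """Keep the first document seen for each key value (like $group with $first)."""
    first_seen = {}
    for doc in docs:
        # setdefault stores the document only if the key has not been seen yet,
        # so later duplicates are ignored.
        first_seen.setdefault(key(doc), doc)
    return list(first_seen.values())


# Hypothetical documents: two share the same match._id.
docs = [
    {"_id": 1, "match": {"_id": "m1"}, "score": 5},
    {"_id": 2, "match": {"_id": "m1"}, "score": 7},
    {"_id": 3, "match": {"_id": "m2"}, "score": 1},
]

survivors = first_per_group(docs, key=lambda d: d["match"]["_id"])
survivor_ids = [d["_id"] for d in survivors]
print(survivor_ids)
```

Python keeps the input order here. In MongoDB, which document counts as "first" without a $sort stage depends on the order in which documents reach $group, so add a sort if it matters which duplicate is kept.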


Delete all Duplicate Rows except for One in MySQL?

Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.

NB - You need to do this first on a test copy of your table!

When I did it, I found that unless I also included AND n1.id <> n2.id, it deleted every row in the table.

  1. If you want to keep the row with the lowest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name

  2. If you want to keep the row with the highest id value:

    DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name

I used this method in MySQL 5.1; I'm not sure about other versions.
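Before running either DELETE, it can help to list which rows the self-join would match. The sketch below does that as a dry run against an in-memory SQLite database from Python. SQLite is an assumption here (it does not support MySQL's multi-table DELETE, so only the join condition is reproduced as a SELECT), and the rows and the helper matched_ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "ann"), (4, "ann"), (5, "bob")],
)


def matched_ids(condition):
    """Return the ids of n1 rows that the self-join would match under `condition`."""
    query = (
        "SELECT DISTINCT n1.id FROM names n1, names n2 "
        f"WHERE {condition} ORDER BY n1.id"
    )
    return [row[0] for row in conn.execute(query)]


# With the id comparison, only the later duplicates are matched.
with_id_check = matched_ids("n1.id > n2.id AND n1.name = n2.name")

# Without it, every row matches itself, so a DELETE would empty the table.
without_id_check = matched_ids("n1.name = n2.name")

print(with_id_check, without_id_check)
```

The second result shows why the id comparison matters: a join on name alone pairs each row with itself.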


Update: Since people Googling how to remove duplicates end up here: although the OP's question is about DELETE, be advised that using INSERT with DISTINCT is much faster. For a database with 8 million rows, the query below took 13 minutes, whereas the DELETE ran for more than 2 hours without completing.

INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;
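Here is a small sketch of the copy-with-DISTINCT approach, run against an in-memory SQLite database from Python. The column types and sample rows are assumptions; only the table and column names come from the query above.

```python
import sqlite3

COLUMNS = "cellId INTEGER, attributeId INTEGER, entityRowId INTEGER, value TEXT"

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE tableName ({COLUMNS})")
conn.execute(f"CREATE TABLE tempTableName ({COLUMNS})")
conn.executemany(
    "INSERT INTO tableName (cellId, attributeId, entityRowId, value) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "a"),
        (1, 1, 1, "a"),
        (2, 1, 1, "b"),
        (1, 1, 1, "a"),
        (2, 1, 1, "b"),
        (3, 2, 1, "c"),
    ],
)

# Copy one instance of each distinct row into the new table.
conn.execute(
    "INSERT INTO tempTableName (cellId, attributeId, entityRowId, value) "
    "SELECT DISTINCT cellId, attributeId, entityRowId, value FROM tableName"
)
conn.commit()

rows_before = conn.execute("SELECT COUNT(*) FROM tableName").fetchone()[0]
rows_after = conn.execute("SELECT COUNT(*) FROM tempTableName").fetchone()[0]
print(rows_before, rows_after)
```

Note that this only collapses rows that are identical in every selected column; it does not let you choose which of several near-duplicates to keep.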

Delete duplicate records leaving only the latest one

You can use COUNT with PARTITION BY to find the duplicate records and insert them into the answersArchive table, as follows.

1- Find the duplicates and insert them into the answersArchive table

--copy the duplicate records
;WITH cte
     AS (SELECT id,
                answer,
                country_id,
                question_id,
                updated,
                Count(*)
                  OVER(
                    partition BY question_id ) ct
         FROM   answers
         WHERE  country_id = 15)
INSERT INTO answersarchive
SELECT id,
       answer,
       country_id,
       question_id,
       updated
FROM   cte
WHERE  ct > 1 --Give you duplicate records

2- Delete all duplicates except the latest one.

You can use a CTE to delete the records. To find the duplicates, use ROW_NUMBER() with PARTITION BY question_id, as in the following query.

;WITH cte
     AS (SELECT id,
                answer,
                country_id,
                question_id,
                updated,
                Row_number()
                  OVER(
                    partition BY question_id
                    ORDER BY updated DESC) RN
         FROM   answers
         WHERE  country_id = 15)
DELETE FROM cte
WHERE  rn > 1
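The two steps can be tried end to end from Python. This is a rough sketch using an in-memory SQLite database as a stand-in (an assumption; it needs SQLite 3.25 or newer). SQLite cannot delete directly from a CTE, so the delete step filters answers by the ids the CTE selects instead. The schema types and sample rows are made up.

```python
import sqlite3

COLUMNS = "id INTEGER PRIMARY KEY, answer TEXT, country_id INTEGER, question_id INTEGER, updated TEXT"

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE answers ({COLUMNS})")
conn.execute(f"CREATE TABLE answersarchive ({COLUMNS})")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 15, 100, "2024-01-01"),
        (2, "b", 15, 100, "2024-02-01"),
        (3, "c", 15, 200, "2024-01-05"),
        (4, "d", 15, 100, "2024-03-01"),
        (5, "e", 16, 100, "2024-01-01"),
        (6, "f", 16, 100, "2024-02-01"),
    ],
)

# Step 1: archive every row of country 15 whose question_id occurs more than once.
conn.execute(
    "WITH cte AS ("
    "  SELECT id, answer, country_id, question_id, updated,"
    "         COUNT(*) OVER (PARTITION BY question_id) AS ct"
    "  FROM answers WHERE country_id = 15) "
    "INSERT INTO answersarchive "
    "SELECT id, answer, country_id, question_id, updated FROM cte WHERE ct > 1"
)

# Step 2: delete all but the most recently updated row per question_id.
conn.execute(
    "WITH cte AS ("
    "  SELECT id, ROW_NUMBER() OVER ("
    "    PARTITION BY question_id ORDER BY updated DESC) AS rn"
    "  FROM answers WHERE country_id = 15) "
    "DELETE FROM answers WHERE id IN (SELECT id FROM cte WHERE rn > 1)"
)
conn.commit()

archived_ids = [r[0] for r in conn.execute("SELECT id FROM answersarchive ORDER BY id")]
remaining_ids = [r[0] for r in conn.execute("SELECT id FROM answers ORDER BY id")]
print(archived_ids, remaining_ids)
```

Rows for country_id 16 are untouched because both CTEs filter on country_id = 15 before counting or numbering.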

How to remove duplicate MySQL records (but only leave one)

You can use this to keep the row with the lowest id value:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id > e2.id AND e1.email = e2.email;


Or you can change > to < to keep the highest id:

DELETE e1 FROM contacts e1, contacts e2 WHERE e1.id < e2.id AND e1.email = e2.email;



