Delete Duplicate SQL Records

Which way is faster to delete duplicate rows in sql?

Not having a primary key for your table is a general bad idea. Here is one way you can delete duplicates, with the record retained per 23 columns is arbitrary:

WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3, ..., col22, col23
ORDER BY (SELECT NULL)) rn
FROM yourTable
)

DELETE
FROM cte
WHERE rn > 1;

delete duplicate records with in

If you want to delete older duplicate values, you can use:

delete from foo
where foo.id < (select max(foo2.id)
from foo foo2
where foo2.a = foo.a and foo2.b = foo.b
);

Note that an index on (a, b, id) would help performance.

You can also phrase this as a join:

delete from foo
using (select a, b, max(id) as max_id
from foo
group by a, b
) ab
where foo.a = a.a and foo.b = ab.b and foo.id < ab.max_id;

Delete Duplicate Rows in SQL

You can use a Common Table Expression to delete the duplicates:

WITH Cte AS(
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY PersonAliasId, StartDateTime, GroupId
ORDER BY ModifiedDateTIme DESC)
FROM Attendance
)
DELETE FROM Cte WHERE Rn > 1;

This will keep the most recent record for each PersonAliasId - StartDateTime - GroupId combination.

Delete duplicate records on SQL Server

You can use an updatable CTE with row-numbering to calculate which rows to delete.

You may need to adjust the partitioning and ordering clauses, it's not clear exactly what you need.

WITH cte AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY COD_MOVIMENTO, COD_PLANO_PAGAMENTO ORDER BY (SELECT 1)
FROM TB_MOVIMENTO_PDV_DETALHE_PLANO_PAGAMENTO mp
)
DELETE FROM cte
WHERE rn > 1;

How can I remove duplicate rows?

Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) RowId as the row to keep. Then, just delete everything that didn't have a row id:

DELETE FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, Col1, Col2, Col3
FROM MyTable
GROUP BY Col1, Col2, Col3
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL

In case you have a GUID instead of an integer, you can replace

MIN(RowId)

with

CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn)))


Related Topics



Leave a reply



Submit