Deleting Duplicate Records Using a Temporary Table

Delete duplicate rows from temp table in SQL

You can use a CTE with ROW_NUMBER() to delete the duplicates:

;with cte as (
--order by your id column instead if it matters which row is kept; no rule was given for choosing between answertext values
select *, RowN = Row_number() over (partition by assid, questionid order by answertext) from yourtable
)
delete from cte where RowN > 1
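
To try this out without touching real data, here is a minimal sketch on a throwaway temp table. The table name #answers, its id column and the sample rows are assumptions made for illustration; only assid, questionid and answertext come from the answer above. It previews the rows first, then deletes them.

```sql
-- Hypothetical sample table; #answers and id are not part of the original question.
CREATE TABLE #answers (
    id int IDENTITY(1,1) PRIMARY KEY,
    assid int,
    questionid int,
    answertext varchar(50)
);

INSERT INTO #answers (assid, questionid, answertext)
VALUES (1, 10, 'yes'), (1, 10, 'yes'), (1, 11, 'no'), (2, 10, 'maybe'), (2, 10, 'maybe');

-- Preview: every row numbered 2 or higher within its group is a duplicate.
WITH cte AS (
    SELECT id, assid, questionid, answertext,
           ROW_NUMBER() OVER (PARTITION BY assid, questionid ORDER BY id) AS RowN
    FROM #answers
)
SELECT * FROM cte WHERE RowN > 1;

-- Same numbering, this time deleting through the CTE.
WITH cte AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY assid, questionid ORDER BY id) AS RowN
    FROM #answers
)
DELETE FROM cte WHERE RowN > 1;

-- One row per (assid, questionid) pair is left.
SELECT * FROM #answers ORDER BY id;

DROP TABLE #answers;
```

Ordering by id keeps the oldest row of each group; swap the ORDER BY if a different row should survive.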

Delete duplicate rows and their dependencies in other tables by using temporary tables

create table dbo.hasduplicates
(
id int identity,
--assume colA, colB is the entity/unique combo
colA varchar(10),
colB int,
someOtherColumn varchar(40)
);

insert into dbo.hasduplicates(colA, colB, someOtherColumn)
values
('A', 1, 'A1 - 1'),
('A', 1, 'A1 - 2'),
('A', 1, 'A1 - 3'),
--
('A', 2, 'A2 - 1'),
('A', 2, 'A2 - 2'),
--
('B', 1, 'B1 - 1'),
('B', 1, 'B1 - 2'),
('B', 1, 'B1 - 3');

select *
from dbo.hasduplicates;

--temp table holding the to-be-deleted ids (of the duplicates)
create table #ToBedeleted(IdToDelete int);

with dup
as
(
select *, row_number() over (partition by colA, colB /*<--cols of your entity go here*/ order by id) as RowNum
from dbo.hasduplicates
)
insert into #ToBedeleted(IdToDelete)
select Id
from dup
where RowNum >= 2;

--contains the ids for deletion
select * from #ToBedeleted;

--cleanup the referencing tables
/*
DELETE FROM dbo.Table1 WHERE Table1Id IN (SELECT IdToDelete FROM #ToBedeleted);
DELETE FROM dbo.Table2 WHERE Table2Id IN (SELECT IdToDelete FROM #ToBedeleted);
.............
DELETE FROM dbo.Table6 WHERE Table6Id IN (SELECT IdToDelete FROM #ToBedeleted);
--finally cleanup your products table
DELETE FROM dbo.hasduplicates WHERE Id IN (SELECT IdToDelete FROM #ToBedeleted);
*/

--/*
drop table #ToBedeleted;
drop table dbo.hasduplicates;
--*/
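
Since the cleanup touches several tables, it may be worth running the deletes as one unit so a failure halfway through does not leave orphaned rows. The sketch below assumes SQL Server 2012 or later (for THROW) and uses two hypothetical temp tables, #parent and #child, in place of the real tables; the parentId column is also an assumption.

```sql
-- Hypothetical stand-ins for the table with duplicates and one referencing table.
CREATE TABLE #parent (id int PRIMARY KEY, colA varchar(10), colB int);
CREATE TABLE #child (childId int PRIMARY KEY, parentId int);

INSERT INTO #parent (id, colA, colB)
VALUES (1, 'A', 1), (2, 'A', 1), (3, 'B', 1), (4, 'B', 1);

INSERT INTO #child (childId, parentId)
VALUES (10, 1), (11, 2), (12, 4);

-- Collect the ids of the duplicates once.
CREATE TABLE #ToBedeleted (IdToDelete int PRIMARY KEY);

WITH dup AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY colA, colB ORDER BY id) AS RowNum
    FROM #parent
)
INSERT INTO #ToBedeleted (IdToDelete)
SELECT id
FROM dup
WHERE RowNum >= 2;

BEGIN TRY
    BEGIN TRANSACTION;

    -- Referencing rows go first, then the duplicates themselves.
    DELETE FROM #child  WHERE parentId IN (SELECT IdToDelete FROM #ToBedeleted);
    DELETE FROM #parent WHERE id       IN (SELECT IdToDelete FROM #ToBedeleted);

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo both deletes if either one fails, then re-raise the error.
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    THROW;
END CATCH;

SELECT * FROM #parent;
SELECT * FROM #child;

DROP TABLE #ToBedeleted;
DROP TABLE #child;
DROP TABLE #parent;
```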

Deleting duplicate records using a temporary table

The block of code marked --delete all rows that are duplicated already gets rid of the duplicates, so what is the last section for?

First, it deletes every row that has a duplicate, and that includes the original copy. In the case above, only one row ('not duplicate row') remains in the table after the DELETE. The other four rows are all deleted.

Then it repopulates the table with the deleted rows, this time with the duplicates removed.
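
For reference, the delete-then-reinsert approach described above could look roughly like this. It is a sketch, not the code from the question: the sample values and the #holding temp table are assumptions chosen to match the description (one unique row, four rows that are duplicated).

```sql
-- Assumed sample data: one unique value and two values that each appear twice.
DECLARE @table TABLE (data varchar(30));

INSERT INTO @table (data)
VALUES ('not duplicate row'),
       ('duplicate row'),
       ('duplicate row'),
       ('second duplicate row'),
       ('second duplicate row');

-- Keep one copy of every value that occurs more than once.
SELECT data
INTO #holding            -- hypothetical temp table name
FROM @table
GROUP BY data
HAVING COUNT(*) > 1;

-- Delete all rows that are duplicated, originals included.
-- Only 'not duplicate row' is left at this point.
DELETE t
FROM @table AS t
WHERE t.data IN (SELECT data FROM #holding);

-- Put a single copy of each duplicated value back.
INSERT INTO @table (data)
SELECT data
FROM #holding;

SELECT data FROM @table;

DROP TABLE #holding;
```

The whole script has to run as one batch, because the table variable goes out of scope at the end of the batch.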

This is not the best way to delete duplicates.

The best way is:

WITH q AS (
SELECT data, ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
FROM @table
)
DELETE
FROM q
WHERE rn > 1

Remove duplicate fields from a temp table that has no primary key

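Well, I'm late to the party, but here is a mostly database-agnostic solution (the square-bracket quoting is SQL Server style, so swap in your database's identifier quoting). Note that it selects the rows to keep rather than deleting anything, and rows that tie on the latest DOB are all returned: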

SELECT A.*
FROM YourTable A
INNER JOIN (SELECT [First], [Last], MAX(DOB) MaxDob
FROM YourTable
GROUP BY [First], [Last]) B
ON A.[First] = B.[First]
AND A.[Last] = B.[Last]
AND A.DOB = B.MaxDob

And here is a sqlfiddle with a demo for it. (Thanks @JW for the schema of the fiddle)
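
If the goal is to actually remove the older rows rather than just select the newest one, the same rule can be turned into a delete. This is a sketch in T-SQL syntax (DELETE with an alias) against a hypothetical temp table #YourTable with made-up sample rows.

```sql
-- Hypothetical sample table mirroring the First / Last / DOB columns used above.
CREATE TABLE #YourTable ([First] varchar(20), [Last] varchar(20), DOB date);

INSERT INTO #YourTable ([First], [Last], DOB)
VALUES ('Ann', 'Lee', '1980-01-01'),
       ('Ann', 'Lee', '1985-06-15'),
       ('Bob', 'Ray', '1970-03-03');

-- Delete a row whenever another row with the same name has a later DOB.
DELETE A
FROM #YourTable AS A
WHERE EXISTS (
    SELECT 1
    FROM #YourTable AS B
    WHERE B.[First] = A.[First]
      AND B.[Last]  = A.[Last]
      AND B.DOB     > A.DOB
);

-- The row with the latest DOB per name is left.
SELECT [First], [Last], DOB FROM #YourTable;

DROP TABLE #YourTable;
```

Rows that share both the name and the latest DOB are not touched by this, since neither one is "later" than the other. Without a key column there is nothing to tell them apart, so those need ROW_NUMBER() or a similar tiebreaker.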

Delete duplicate records without creating a temporary table

Here's an in-place solution (though not a one-liner).

Find out max id:

select max(id) as maxid 
from shop;

Remember this value. Let's say it equals 1000.

Re-insert unique values, with offset:

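--DISTINCT only collapses rows that match on both id and tax_id;
--if id is unique per row, use "select min(id) + 1000, tax_id ... group by tax_id" instead
insert into shop (id, tax_id) 
select distinct id + 1000, tax_id
from shop;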

Drop old values:

delete from shop
where id <= 1000;

Restore normal ids:

update shop
set id = id - 1000;

PROFIT!
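
The steps above can be run as a single script so nobody has to remember the max id by hand. This is a sketch under a few assumptions: #shop is a hypothetical stand-in for shop, id is unique, and the duplicates are in tax_id, which is why it uses MIN(id) with GROUP BY rather than DISTINCT.

```sql
-- Hypothetical sample table; assumes id is unique and tax_id holds the duplicates.
CREATE TABLE #shop (id int PRIMARY KEY, tax_id int);

INSERT INTO #shop (id, tax_id)
VALUES (1, 100), (2, 100), (3, 200), (4, 300), (5, 300);

BEGIN TRANSACTION;

-- Find the max id and keep it in a variable.
DECLARE @maxid int = (SELECT MAX(id) FROM #shop);

-- Re-insert one row per tax_id, shifted past the current max id.
INSERT INTO #shop (id, tax_id)
SELECT MIN(id) + @maxid, tax_id
FROM #shop
GROUP BY tax_id;

-- Drop the old values.
DELETE FROM #shop
WHERE id <= @maxid;

-- Restore the normal ids.
UPDATE #shop
SET id = id - @maxid;

COMMIT TRANSACTION;

SELECT id, tax_id FROM #shop ORDER BY id;

DROP TABLE #shop;
```

With the sample rows this leaves ids 1, 3 and 4, one per tax_id. The transaction is there so other sessions never see the table in its half-rewritten state. If id is an IDENTITY column this will not work as written, because identity values cannot be updated.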

Deleting massive number of duplicate records without using a new table

In SQL Server, you can delete in batches. Although this is not the most efficient code, it illustrates the idea of deleting in batches:

DECLARE @go_on INT
SELECT @go_on = 1;

WHILE (@go_on = 1)
BEGIN
WITH TODELETE AS (
SELECT TOP (10000) d1.*
FROM (SELECT d1.*,
ROW_NUMBER() OVER (PARTITION BY d_record, d_d2id ORDER BY d_id) as seqnum
FROM d1
WHERE d_d2id >= 25 AND d_d2id <= 28
) d1
WHERE seqnum > 1
)
DELETE FROM TODELETE;

SET @go_on = (CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END);
END;

It would be more efficient to store the rows to be deleted in a temporary table or table variable, so they don't need to be recalculated each time.
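
A rough sketch of that idea: work out the duplicate ids once, store them in a temp table, then delete in batches by looking the ids up. The temp tables #d1 and #todelete, the sample rows and the tiny batch size of 2 are assumptions for illustration; a real run would use something like the 10000 from above.

```sql
-- Hypothetical sample table mirroring the columns of d1.
CREATE TABLE #d1 (d_id int PRIMARY KEY, d_record varchar(20), d_d2id int);

INSERT INTO #d1 (d_id, d_record, d_d2id)
VALUES (1, 'r1', 25), (2, 'r1', 25), (3, 'r1', 25),
       (4, 'r2', 26), (5, 'r2', 26),
       (6, 'r3', 30), (7, 'r3', 30);   -- outside the 25..28 range, so left alone

-- Compute the ids of the duplicates a single time.
SELECT d_id
INTO #todelete
FROM (SELECT d_id,
             ROW_NUMBER() OVER (PARTITION BY d_record, d_d2id ORDER BY d_id) AS seqnum
      FROM #d1
      WHERE d_d2id >= 25 AND d_d2id <= 28
     ) AS x
WHERE seqnum > 1;

-- Delete in batches; each pass only looks ids up instead of renumbering rows.
WHILE 1 = 1
BEGIN
    DELETE TOP (2) d
    FROM #d1 AS d
    WHERE EXISTS (SELECT 1 FROM #todelete AS t WHERE t.d_id = d.d_id);

    IF @@ROWCOUNT = 0 BREAK;
END;

SELECT d_id, d_record, d_d2id FROM #d1 ORDER BY d_id;

DROP TABLE #todelete;
DROP TABLE #d1;
```

On a large table an index on the id column of the worklist table would likely help the lookup.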
