Delete duplicate rows from temp table in SQL
You can use a CTE with ROW_NUMBER() to delete the duplicates:
;with cte as (
--order by answertext is a guess, since you have not said which row to keep; order by your id column if that should decide
select *, RowN = row_number() over (partition by assid, questionid order by answertext)
from yourtable
)
delete from cte where RowN > 1;
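Before running the delete, it can help to look at what it would remove. A minimal sketch, assuming the same yourtable, assid, questionid and answertext names as above: swapping the final DELETE for a SELECT lists the rows that RowN > 1 would hit, without changing anything.

```sql
-- Preview only: nothing is deleted here.
-- Assumes the same table and column names as the snippet above.
;with cte as (
    select *,
           RowN = row_number() over (partition by assid, questionid
                                     order by answertext)
    from yourtable
)
select *
from cte
where RowN > 1;  -- these are the rows the delete would remove
```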
Delete duplicate rows and their dependencies in other tables by using temporary tables
create table dbo.hasduplicates
(
id int identity,
--assume colA, colB is the entity/unique combo
colA varchar(10),
colB int,
someOtherColumn varchar(40)
);
insert into dbo.hasduplicates(colA, colB, someOtherColumn)
values
('A', 1, 'A1 - 1'),
('A', 1, 'A1 - 2'),
('A', 1, 'A1 - 3'),
--
('A', 2, 'A2 - 1'),
('A', 2, 'A2 - 2'),
--
('B', 1, 'B1 - 1'),
('B', 1, 'B1 - 2'),
('B', 1, 'B1 - 3');
select *
from dbo.hasduplicates;
--temp table holding the to-be-deleted ids (of the duplicates)
create table #ToBedeleted(IdToDelete int);
with dup
as
(
select *, row_number() over (partition by colA, colB /*<--cols of your entity go here*/ order by id) as RowNum
from dbo.hasduplicates
)
insert into #ToBedeleted(IdToDelete)
select Id
from dup
where RowNum >= 2;
--contains the ids for deletion
select * from #ToBedeleted;
--clean up the referencing tables first (filter on the column that references dbo.hasduplicates.id)
/*
DELETE FROM dbo.Table1 WHERE Table1Id IN (SELECT IdToDelete FROM #ToBedeleted);
DELETE FROM dbo.Table2 WHERE Table2Id IN (SELECT IdToDelete FROM #ToBedeleted);
.............
DELETE FROM dbo.Table6 WHERE Table6Id IN (SELECT IdToDelete FROM #ToBedeleted);
--finally clean up the table that holds the duplicates
DELETE FROM dbo.hasduplicates WHERE Id IN (SELECT IdToDelete FROM #ToBedeleted);
*/
--/*
drop table #ToBedeleted;
drop table dbo.hasduplicates;
--*/
Deleting duplicate records using a temporary table
The block of code under --delete all rows that are duplicated already gets rid of the duplicates, so what is the last section for?
First, it deletes every row that has ever had a duplicate, the originals included. In the case above, only one row ('not duplicate row') will remain in the table after the DELETE; all four other rows will be deleted.
Then it populates the table with the deleted rows again, this time with the duplicates removed.
This is not the best way to delete duplicates.
A more direct way is to number the rows within each group of equal values and delete all but the first:
WITH q AS (
SELECT data, ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
FROM @table
)
DELETE
FROM q
WHERE rn > 1
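To try the statement above in isolation, here is a small self-contained sketch. The table variable and its sample values are assumptions made up for illustration (one unique row plus four rows that have duplicates); only the CTE delete itself comes from the answer.

```sql
-- Hypothetical sample data; the values are assumptions for illustration.
DECLARE @table TABLE (data VARCHAR(30));

INSERT INTO @table (data)
VALUES ('not duplicate row'),
       ('duplicate row'),
       ('duplicate row'),
       ('second duplicate row'),
       ('second duplicate row');

-- Number the rows within each group of equal values, then delete
-- every row after the first one in its group.
WITH q AS (
    SELECT data,
           ROW_NUMBER() OVER (PARTITION BY data ORDER BY data) AS rn
    FROM @table
)
DELETE FROM q
WHERE rn > 1;

-- One row per distinct value should remain.
SELECT data FROM @table;
```

Unlike the delete-then-reinsert approach, the rows that never had duplicates are not touched at all.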
Remove duplicate fields from a temp table that has no primary key
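Well, I'm late to the party, but here is a solution that needs only GROUP BY and a join, so it ports to most databases (the [First] bracket quoting is SQL Server style). For each First/Last pair it selects the row with the latest DOB: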
SELECT A.*
FROM YourTable A
INNER JOIN (SELECT [First], [Last], MAX(DOB) MaxDob
FROM YourTable
GROUP BY [First], [Last]) B
ON A.[First] = B.[First]
AND A.[Last] = B.[Last]
AND A.DOB = B.MaxDob
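The query above only returns the rows to keep. To actually remove the others in SQL Server, one option is to rank the rows per name and delete everything after the first. This is a sketch under the assumption that the newest DOB wins; unlike the join, it also leaves a single row when two rows share the same First, Last and DOB.

```sql
-- Assumes SQL Server; YourTable, [First], [Last] and DOB come from the query above.
-- No primary key is needed because the delete goes through the CTE.
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY [First], [Last]
                              ORDER BY DOB DESC) AS rn
    FROM YourTable
)
DELETE FROM ranked
WHERE rn > 1;  -- keep only the newest row per First/Last pair
```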
Delete duplicate records without creating a temporary table
Here's an in-place solution (though not a one-liner).
Find out max id:
select max(id) as maxid
from shop;
Remember this value. Let's say it equals 1000.
Re-insert one row per unique value, with the ids shifted by that offset:
insert into shop (id, tax_id)
--one row per tax_id, keeping its lowest id
select min(id) + 1000, tax_id
from shop
group by tax_id;
Drop old values:
delete from shop
where id <= 1000;
Restore normal ids:
update shop
set id = id - 1000;
PROFIT!
Deleting massive number of duplicate records without using a new table
In SQL Server, you can delete in batches. Although this is not the most efficient code, it illustrates the idea of deleting in batches:
DECLARE @go_on INT
SELECT @go_on = 1;
WHILE (@go_on = 1)
BEGIN
WITH TODELETE AS (
SELECT TOP (10000) d1.*
FROM (SELECT d1.*,
ROW_NUMBER() OVER (PARTITION BY d_record, d_d2id ORDER BY d_id) as seqnum
FROM d1
WHERE d_d2id >= 25 AND d_d2id <= 28
) d1
WHERE seqnum > 1
)
DELETE FROM TODELETE;
SET @go_on = (CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END);
END;
It would be more efficient to store the rows to be deleted in a temporary table or table variable, so they don't need to be recalculated each time.
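That suggestion could look roughly like the sketch below. The temp table name #ToDelete and the INT type of d_id are assumptions; d1, d_id, d_record and d_d2id are taken from the query above. It also assumes d_id uniquely identifies a row in d1, since otherwise the join would remove more than the duplicates.

```sql
-- Sketch only: #ToDelete is a hypothetical name, and d_id is assumed
-- to be a unique INT key of d1.
CREATE TABLE #ToDelete (d_id INT PRIMARY KEY);

-- Compute the duplicate ids once, outside the loop.
INSERT INTO #ToDelete (d_id)
SELECT d_id
FROM (SELECT d_id,
             ROW_NUMBER() OVER (PARTITION BY d_record, d_d2id ORDER BY d_id) AS seqnum
      FROM d1
      WHERE d_d2id >= 25 AND d_d2id <= 28
     ) x
WHERE seqnum > 1;

DECLARE @go_on INT = 1;

WHILE (@go_on = 1)
BEGIN
    -- Each pass removes up to 10000 of the remaining listed rows.
    DELETE TOP (10000) d
    FROM d1 AS d
    INNER JOIN #ToDelete AS t ON t.d_id = d.d_id;

    SET @go_on = (CASE WHEN @@ROWCOUNT > 0 THEN 1 ELSE 0 END);
END;

DROP TABLE #ToDelete;
```

The window function now runs once instead of once per batch; each pass only has to join against the stored ids.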