How to Find Duplicate Entries and Delete the Oldest Ones in SQL

How can I find duplicate entries and delete the oldest ones in SQL?

Find duplicates and delete oldest one

alt text

Here is the Code

create table #Product (
    ID      int identity(1, 1) primary key,
    Name        varchar(800),
    DateAdded   datetime default getdate()
)

insert  #Product(Name) select 'Chocolate'
insert  #Product(Name,DateAdded) select 'Candy', GETDATE() + 1
insert  #Product(Name,DateAdded) select 'Chocolate', GETDATE() + 5
select * from #Product

;with Ranked as (
    select  ID, 
        dense_rank() 
        over (partition by Name order by DateAdded desc) as DupeCount
    from    #Product P
)
delete  R
from    Ranked R
where   R.DupeCount > 1

select * from #Product

TSQL - Deleting a duplicate oldest value

I like to use an updatable CTE for this:

with cte as (
    select row_number() over(partition by userid order by logintim desc) rn
    from mytable
)
delete from cte where rn > 1

For each userid, this retains the row with the most recent logintim and deletes the others (if any).

Oracle SQL query: Delete the oldest duplicate records from the table

You could use this query:

delete
from   users
where  (id, time) in 
       (select id, time
        from   (select id, time, 
                       row_number() over (partition by id order by time desc) as rn,
                from   users) sub
        where   rn > 1)

It will delete all "duplicates" for a certain person, except the most recent one. The idea is that when you number the occurrences of a certain id, from recent to old, only the records numbered with 1 should be kept.

Pseudo-column `ROWID`

As guigui42 mentioned in comments below, the Oracle specific pseudo-column rowid may give a further performance improvement. This would be certainly the case if you do not have an index starting with the id, time fields:

delete
from   users
where  rowid in 
       (select rowid
        from   (select rowid, 
                       row_number() over (partition by id order by time desc) as rn,
                from   users) sub
        where   rn > 1)

Delete duplicate data in SQL preserving the oldest entry

Declare @YourTable table (col1 varchar(25),col2 varchar(25))
Insert Into @YourTable values
('0010','1111'),
('0011','1112'),
('0012','1111')

;with cteBase as (
    Select *,RowNr=Row_Number() over (Partition By Col2 Order By Col1 Desc) from @YourTable
)
Select * From cteBase where RowNr>1
-- Delete From cteBase where RowNr>1
-- Remove Select if satisfied with results

Record(s) to be Deleted

col1    col2    RowNr
0010    1111    2

Remove oldest duplicates and keep latest duplicate by time stamp

Of course you can use timestamp column with ROW_NUMBER() & you don't need to use recursion hint as your CTE has not any recursion level.

;WITH Duplicates AS  (
       SELECT *, 
              ROW_NUMBER() OVER (PARTITION BY ChannelName, SerialNumber, ReadingDate ORDER BY timestamp DESC) AS Rownumber
       FROM [Staging].[UriData]        
 ) 

DELETE d
FROM Duplicates d
WHERE Rownumber > 1;

Delete duplicate records with oldest GUID

If you want to delete duplicate records but without a timestamp and GUID as Primary Key you can try to use %%physloc%%.

%%physloc%% is a Virtual Column. This column is undocumented. Hence you can use it at you own risk.

WITH cte 
     AS(SELECT ID, ROW_NUMBER() OVER(PARTITION BY CodeOne, CodeTwo
     ORDER BY  %%physloc%%) RN
 FROM [InvoiceDatabase].[dbo].[LookUpCode])
 DELETE FROM cte
WHERE  RN > 1;

sqlfiddle

Note

But I would suggest you create a timestamp for the order.

Delete duplicate rows with older/newer date

You can use a where clause in the delete:

delete from t
    where t.datetime > (select min(t2.datetime)
                        from t t2
                        where t2.id = t.id
                       );

If you don't actually want to delete the rows, but just run a query and not see the red ones:

select id, min(datetime)
from t
group by id;

How to Find Duplicate Entries and Delete the Oldest Ones in SQL

How can I find duplicate entries and delete the oldest ones in SQL?

TSQL - Deleting a duplicate oldest value

Oracle SQL query: Delete the oldest duplicate records from the table

Pseudo-column `ROWID`

Delete duplicate data in SQL preserving the oldest entry

Remove oldest duplicates and keep latest duplicate by time stamp

Delete duplicate records with oldest GUID

Delete duplicate rows with older/newer date

Related Topics

Leave a reply

How can I find duplicate entries and delete the oldest ones in SQL?

TSQL - Deleting a duplicate oldest value

Oracle SQL query: Delete the oldest duplicate records from the table

Pseudo-column ROWID

Delete duplicate data in SQL preserving the oldest entry

Remove oldest duplicates and keep latest duplicate by time stamp

Delete duplicate records with oldest GUID

Delete duplicate rows with older/newer date

Related Topics

Leave a reply

Pseudo-column `ROWID`