Update One of 2 Duplicates in an SQL Server Database Table

Try this with a CTE and PARTITION BY:

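;WITH cte AS
(
    -- Number the rows within each group of identical Column1 values
    SELECT
        ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Column1) AS rno,
        Column1
    FROM Clients
)
-- Change only the second row of each group so it no longer matches the first
UPDATE cte SET Column1 = Column1 + ' 1 '
WHERE rno = 2;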
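
To see the effect on a small data set, here is a minimal sketch. It assumes a throwaway temp table named #Clients (hypothetical, standing in for Clients) with a single varchar column. Because the ORDER BY inside the window uses the same column as the PARTITION BY, which of the two duplicates receives rno = 2 is arbitrary.

```sql
-- Hypothetical demo table standing in for Clients
CREATE TABLE #Clients (Column1 varchar(50));
INSERT INTO #Clients (Column1) VALUES ('Alice'), ('Alice'), ('Bob');

;WITH cte AS
(
    SELECT
        ROW_NUMBER() OVER (PARTITION BY Column1 ORDER BY Column1) AS rno,
        Column1
    FROM #Clients
)
UPDATE cte SET Column1 = Column1 + ' 1 '
WHERE rno = 2;

-- One 'Alice' row is unchanged, the other now carries the ' 1 ' suffix; 'Bob' is untouched
SELECT Column1 FROM #Clients ORDER BY Column1;

DROP TABLE #Clients;
```

With three or more copies of a value, only the second copy is changed, so the remaining copies would still be duplicates of the first.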

How to update column of duplicate row in SQL Server

Once you have a primary key in your table that uniquely identifies each row, then you can use a CTE and this approach to weed out any duplicates:

;WITH Duplicates AS
(
    SELECT
        UserID, Username, Password, Email, Active,
        RowNum = ROW_NUMBER() OVER (PARTITION BY Username, Email ORDER BY UserID)
    FROM
        tUser
)
UPDATE tUser
SET Active = 0
FROM Duplicates d
WHERE d.UserID = tUser.UserID
  AND d.RowNum > 1
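
Before running the UPDATE, it can be worth listing the rows it would touch. The following is a minimal sketch of that preview, assuming a hypothetical temp table #tUser with the same columns the answer uses (minus Password) and explicit UserID values:

```sql
-- Hypothetical demo table standing in for tUser
CREATE TABLE #tUser
(
    UserID   int PRIMARY KEY,
    Username varchar(50),
    Email    varchar(100),
    Active   bit NOT NULL DEFAULT 1
);
INSERT INTO #tUser (UserID, Username, Email)
VALUES (1, 'ann', 'ann@example.com'),
       (2, 'ann', 'ann@example.com'),
       (3, 'bob', 'bob@example.com');

;WITH Duplicates AS
(
    SELECT
        UserID, Username, Email, Active,
        RowNum = ROW_NUMBER() OVER (PARTITION BY Username, Email ORDER BY UserID)
    FROM #tUser
)
-- Lists only the rows the UPDATE would deactivate (here: UserID 2)
SELECT UserID, Username, Email
FROM Duplicates
WHERE RowNum > 1;

DROP TABLE #tUser;
```

Ordering by UserID means the row with the lowest key in each group is the one that stays active.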

Update all but one of duplicate records in table in SQL Server

You can solve this problem without a join, which should perform better. The idea is to number the rows within each object_id group, which is what "partition by" does, and then update every row whose row_num is greater than 1. This updates all duplicates of each object_id except the first one. Because the ordering is (select null), which row counts as the first is arbitrary.

update t set t.status_val = 'some_status' 
from (
    select *, row_number() over(partition by object_id order by (select null)) row_num  
    from foo
) t 
where row_num > 1

On a test table of 82944 records, performance was as follows (your mileage may vary!):
Table 'test'. Scan count 5, logical reads 82283, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 141 ms, elapsed time = 150 ms.
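
Output in this form is what SQL Server reports when SET STATISTICS IO and SET STATISTICS TIME are switched on for the session. A minimal sketch for collecting your own figures, assuming a hypothetical temp table #foo standing in for foo:

```sql
-- Hypothetical demo table standing in for foo
CREATE TABLE #foo (object_id int, status_val varchar(20));
INSERT INTO #foo (object_id, status_val)
VALUES (1, 'new'), (1, 'new'), (2, 'new');

SET STATISTICS IO ON;    -- report scan counts and logical/physical reads per table
SET STATISTICS TIME ON;  -- report CPU time and elapsed time

update t set t.status_val = 'some_status'
from (
    select *, row_number() over(partition by object_id order by (select null)) row_num
    from #foo
) t
where row_num > 1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

DROP TABLE #foo;
```

The numbers on a three-row table are not meaningful in themselves; the point is only how to switch the reporting on and off around the statement being measured.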

We can also solve this problem with an inner join, but in general this should lead to more logical reads and higher CPU usage:

Table 'test'. Scan count 10, logical reads 83622, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 167426, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 342 ms, elapsed time = 233 ms.
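
The join-based query itself is not shown in the answer. The following is one plausible form, not necessarily the one that was measured. It assumes a unique id column (hypothetical, not in the original) so that one row per object_id can be singled out as the keeper:

```sql
-- Hypothetical demo table; the id column is an assumption made for this sketch
CREATE TABLE #foo (id int PRIMARY KEY, object_id int, status_val varchar(20));
INSERT INTO #foo (id, object_id, status_val)
VALUES (1, 10, 'new'), (2, 10, 'new'), (3, 10, 'new'), (4, 20, 'new');

UPDATE f
SET f.status_val = 'some_status'
FROM #foo AS f
INNER JOIN (
    -- One row per duplicated object_id, carrying the id of the row to leave alone
    SELECT object_id, MIN(id) AS keep_id
    FROM #foo
    GROUP BY object_id
    HAVING COUNT(*) > 1
) AS d
    ON d.object_id = f.object_id
WHERE f.id <> d.keep_id;

-- ids 2 and 3 are updated; ids 1 and 4 keep status_val = 'new'
SELECT id, object_id, status_val FROM #foo ORDER BY id;

DROP TABLE #foo;
```

This form reads the table twice, once for the grouping and once for the update target, which is consistent with the extra scans in the figures above.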

To loop over the results and update in smaller batches:

declare @rowcount int = 1;
declare @batch_size int = 1000;

while @rowcount > 0 
begin
    update top(@batch_size) t set t.status_val = 'already updated'
    from (
        select *, row_number() over(partition by object_id order by (select null)) row_num  
        from foo
        -- rows with a NULL status_val never satisfy <>, so include them explicitly
        where status_val is null or status_val <> 'already updated'
    ) t 
    where row_num > 1 
    set @rowcount = @@rowcount;
end

Updating in small batches helps reduce lock contention when other concurrent sessions are trying to access this table.

Update a column of same table if there is duplicate records

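You can use the analytic function row_number() together with rowid to pick out the rows. Note that ROWID is an Oracle pseudocolumn, so this statement is Oracle syntax rather than SQL Server: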

UPDATE test_dup
   SET done = 'error'
 WHERE ROWID IN (SELECT rid
                   FROM (SELECT ROWID AS rid,
                                ROW_NUMBER () OVER (PARTITION BY acc_num, tel_num, imsi ORDER BY acc_num) AS rn
                           FROM test_dup)
                  WHERE rn > 1)

How do I update a table that references duplicate records?

I found a solution that has made this process easier. I first use Row_Number to find duplicates in Table A and SELECT INTO a temporary table.

SELECT
       a.Id
     , a.Name
     , ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Id DESC) RN
INTO
     #TestTable
FROM
     TableA a WITH(NOLOCK)

I then JOIN Table A and Table B to see where the IDs match and identify which ID I need to keep and which IDs I need to delete:

SELECT
       b.Id
     , b.Name
     , b.RefId
     , ToKeep.Id   KeepId
     , ToDelete.Id DeleteId
FROM
     #TestTable ToDelete
     JOIN TableB b WITH(NOLOCK)
        ON b.RefId = ToDelete.Id
     JOIN #TestTable ToKeep
        ON ToDelete.Name = ToKeep.Name
           AND ToKeep.RN = 1
WHERE ToDelete.RN > 1

Then using a similar statement, I just update the records:

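-- Repoint references from each duplicate to the row being kept
UPDATE b
SET
    b.RefId = ToKeep.Id
FROM #TestTable ToDelete
     JOIN TableB b   -- no NOLOCK here: the hint is not meant for the target table of an UPDATE
        ON b.RefId = ToDelete.Id
     JOIN #TestTable ToKeep
        ON ToDelete.Name = ToKeep.Name
           AND ToKeep.RN = 1
WHERE
      ToDelete.RN > 1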

Lastly, I can now delete the duplicate records:

DELETE a
FROM #TestTable b
     INNER JOIN TableA a
        ON b.Id = a.Id
WHERE
      b.RN > 1

After this, you can rerun the first SELECT statement to confirm that all duplicates are gone. Just remove the INTO #TestTable clause so that it only returns the rows; every remaining row should have RN = 1.
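
An independent check is to count rows per Name directly. A minimal sketch, using a hypothetical temp table #TableA as a stand-in for TableA after the cleanup:

```sql
-- Hypothetical stand-in for TableA after the duplicates were deleted
CREATE TABLE #TableA (Id int PRIMARY KEY, Name varchar(50));
INSERT INTO #TableA (Id, Name) VALUES (1, 'Acme'), (3, 'Globex');

-- Returns one row per Name that still occurs more than once; no rows means no duplicates remain
SELECT a.Name, COUNT(*) AS Copies
FROM #TableA a
GROUP BY a.Name
HAVING COUNT(*) > 1;

DROP TABLE #TableA;
```

Against the real table, only the final SELECT is needed, with TableA in place of #TableA.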

Thanks to an anonymous colleague of mine for this solution and hope this helps someone out there.

update column table(table has duplicates) with values of another column table

The duplicates are causing the error. Let's get a unique record for each sub_id and CURRENT_BRAND using row_number().

Try the query below:

UPDATE SERGIU_BI_CCM_AGG_MTH t1
SET t1.CURRENT_SEC = foo.CURRENT_SEC
FROM (
        SELECT t1a.sub_id
            , t1a.CURRENT_BRAND
            , t2.CURRENT_SEC
            , t2.from_date
            , t2.to_date
            , row_number() over (partition by t1a.sub_id, t1a.CURRENT_BRAND order by t2.from_date desc) as rn
        FROM SERGIU_BI_CCM_AGG_MTH t1a
        LEFT JOIN BI_CCM_BASE t2 ON t1a.sub_id = t2.sub_id 
            and t1a.CURRENT_BRAND = t2.CURRENT_BRAND
      )foo 
WHERE t1.sub_id = foo.sub_id 
    and t1.CURRENT_BRAND = foo.CURRENT_BRAND    
    and t1.agg_mth between to_char(foo.from_date,'YYYYMM') and to_char(foo.to_date-1,'YYYYMM')
    and foo.rn = 1;

Update column as Duplicate

The following does what you need. It uses row_number to identify any group that contains a duplicate, then an updatable CTE to flag every row that belongs to such a group:

with d as (
    select *, row_number() over(partition by a,b order by a,b) dn
    from t
)
update d set d.status='D'
where exists (select * from d d2 where d2.a=d.a and d2.b=d.b and d2.dn>1)
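
The same result can be reached without the self-referencing EXISTS by counting the rows in each group instead of numbering them. This is an alternative sketch, not the answer's own query, and it assumes a hypothetical temp table #t with the columns a, b and status:

```sql
-- Hypothetical demo table standing in for t
CREATE TABLE #t (a int, b int, status char(1));
INSERT INTO #t (a, b) VALUES (1, 1), (1, 1), (2, 3);

;WITH d AS (
    -- cnt is the size of the (a, b) group each row belongs to
    SELECT status, COUNT(*) OVER (PARTITION BY a, b) AS cnt
    FROM #t
)
UPDATE d SET status = 'D'
WHERE cnt > 1;

-- Both (1, 1) rows are flagged 'D'; the (2, 3) row keeps a NULL status
SELECT a, b, status FROM #t ORDER BY a, b;

DROP TABLE #t;
```

Unlike the row_num > 1 pattern used earlier, this flags every member of a duplicated group, including the first.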
