Update one of 2 duplicates in a SQL Server database table
Try this with a CTE and PARTITION BY:
;WITH cte AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY Column1 ORDER BY Column1 ) AS rno,
Column1
FROM Clients
)
UPDATE cte SET Column1 =Column1 +' 1 '
WHERE rno=2
How to update column of duplicate row in SQL Server
Once you have a primary key in your table that uniquely identifies each row, then you can use a CTE and this approach to weed out any duplicates:
;WITH Duplicates AS
(
SELECT
UserID, Username, Password, Email, Active,
RowNum = ROW_NUMBER() OVER (PARTITION BY Username, Email ORDER BY UserID)
FROM
tUser
)
UPDATE tUser
SET Active = 0
FROM Duplicates d
WHERE d.UserID = tUser.UserID
AND d.RowNum > 1
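As a rough way to try the idea without a SQL Server instance, the sketch below runs the same ROW_NUMBER() / PARTITION BY logic against an in-memory SQLite database from Python. The sample rows are made up for illustration, and because SQLite does not allow updating through a CTE or the T-SQL UPDATE ... FROM form used above, the sketch matches on the primary key with IN instead. It assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tUser (
    UserID   INTEGER PRIMARY KEY,
    Username TEXT,
    Email    TEXT,
    Active   INTEGER DEFAULT 1
);
-- Hypothetical sample data: 'ann' appears three times, 'bob' once.
INSERT INTO tUser (UserID, Username, Email) VALUES
    (1, 'ann', 'ann@example.com'),
    (2, 'ann', 'ann@example.com'),
    (3, 'bob', 'bob@example.com'),
    (4, 'ann', 'ann@example.com');
""")

# Number the rows inside each (Username, Email) group by UserID,
# then deactivate every row that is not the first of its group.
conn.execute("""
WITH Duplicates AS (
    SELECT UserID,
           ROW_NUMBER() OVER (PARTITION BY Username, Email ORDER BY UserID) AS RowNum
    FROM tUser
)
UPDATE tUser
SET Active = 0
WHERE UserID IN (SELECT UserID FROM Duplicates WHERE RowNum > 1)
""")

active_by_id = dict(conn.execute("SELECT UserID, Active FROM tUser ORDER BY UserID"))
print(active_by_id)  # {1: 1, 2: 0, 3: 1, 4: 0}
```

The lowest UserID in each group stays active because the window is ordered by UserID; changing the ORDER BY changes which row survives.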
Update all but one of duplicate records in table in SQL Server
You can solve this problem without a join, which should perform better. The idea is to number the rows within each object_id group; this is what "partition by" does. Then you update the rows where row_num is > 1. This updates every row of a duplicated object_id except the first one in its group. Because the window is ordered by (select null), which row counts as the first is arbitrary.
update t set t.status_val = 'some_status'
from (
select *, row_number() over(partition by object_id order by (select null)) row_num
from foo
) t
where row_num > 1
On a test table of 82944 records, the statistics were as follows (your mileage may vary!):
Table 'test'. Scan count 5, logical reads 82283, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 141 ms, elapsed time = 150 ms.
We can also solve this problem with an inner join; in general, however, that should lead to more logical reads and higher CPU:
Table 'test'. Scan count 10, logical reads 83622, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 4, logical reads 167426, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
CPU time = 342 ms, elapsed time = 233 ms.
To loop over the results and update in smaller batches:
declare @rowcount int = 1;
declare @batch_size int = 1000;
while @rowcount > 0
begin
update top(@batch_size) t set t.status_val = 'already updated'
from (
select *, row_number() over(partition by object_id order by (select null)) row_num
from foo
where status_val <> 'already updated'
) t
where row_num > 1
set @rowcount = @@rowcount;
end
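Batching keeps each update short, which helps keep locking down if other concurrent sessions are trying to access this table. Note that rows whose status_val is NULL do not satisfy the <> comparison, so the loop never considers them.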
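The batching loop can be tried out with a small sketch like the one below, which uses Python's built-in sqlite3 module rather than SQL Server. The table contents and the tiny batch size are invented for illustration. SQLite has no UPDATE TOP(n), so the sketch limits each batch with a LIMIT inside an IN subquery, and it orders the window by id (an assumed primary key) so the surviving row is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY, object_id INTEGER, status_val TEXT NOT NULL)"
)
# Hypothetical data: 3 distinct object_ids with 4 rows each, so 9 rows are redundant.
conn.executemany(
    "INSERT INTO foo (object_id, status_val) VALUES (?, 'new')",
    [(i % 3,) for i in range(12)],
)

batch_size = 4  # deliberately tiny so the loop runs more than once
batches = 0
while True:
    cur = conn.execute(
        """
        UPDATE foo SET status_val = 'already updated'
        WHERE id IN (
            SELECT id FROM (
                SELECT id,
                       ROW_NUMBER() OVER (PARTITION BY object_id ORDER BY id) AS row_num
                FROM foo
                WHERE status_val <> 'already updated'
            )
            WHERE row_num > 1
            LIMIT ?
        )
        """,
        (batch_size,),
    )
    conn.commit()  # one short transaction per batch
    if cur.rowcount == 0:  # nothing left to update
        break
    batches += 1

kept_ids = [r[0] for r in conn.execute("SELECT id FROM foo WHERE status_val = 'new' ORDER BY id")]
print(batches, kept_ids)  # 3 [1, 2, 3]
```

Each pass renumbers only the rows that are still untouched, so the first row of every group is never picked up and the loop ends once one row per object_id remains.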
Update a column of the same table if there are duplicate records
You can use the analytic function row_number() together with rowid (this is Oracle syntax) to get the rows:
UPDATE test_dup
SET done = 'error'
WHERE ROWID IN (SELECT ROWID
FROM (SELECT acc_num, tel_num, imsi, ROW_NUMBER () OVER (PARTITION BY acc_num, tel_num, imsi ORDER BY acc_num) AS ROW_NUMBER FROM test_dup)
WHERE ROW_NUMBER > 1)
How do I update a table that references duplicate records?
I found a solution that has made this process easier. I first use ROW_NUMBER() to find duplicates in Table A and SELECT INTO a temporary table.
SELECT
a.Id
, a.Name
, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY Id DESC) RN
INTO
#TestTable
FROM
TableA a WITH(NOLOCK)
I then JOIN the temporary table (built from Table A) to Table B to see where the IDs match and to identify which ID I need to keep and which IDs I need to delete:
SELECT
b.Id
, b.Name
, b.RefId
, ToKeep.Id KeepId
, ToDelete.Id DeleteId
FROM
#TestTable ToDelete
JOIN TableB b WITH(NOLOCK)
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE ToDelete.RN > 1
Then using a similar statement, I just update the records:
UPDATE b
SET
b.RefId = ToKeep.Id
FROM #TestTable ToDelete
JOIN TableB b -- TableB is the update target, so no NOLOCK hint here
ON b.RefId = ToDelete.Id
JOIN #TestTable ToKeep
ON ToDelete.Name = ToKeep.Name
AND ToKeep.RN = 1
WHERE
ToDelete.RN > 1
Lastly, I can now delete the duplicate records:
DELETE a
FROM #TestTable b
INNER JOIN TableA a
ON b.Id = a.Id
WHERE
b.RN > 1
After this, you can rerun the first SELECT statement to confirm that all duplicates are deleted. Just remove the INTO clause.
Thanks to an anonymous colleague of mine for this solution; I hope this helps someone out there.
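The three steps above (number the rows, repoint the references, delete the duplicates) can be sketched end to end with Python's sqlite3 module. This is only an approximation of the T-SQL: the sample rows are hypothetical, a TEMP table named TestTable stands in for #TestTable, and since the T-SQL UPDATE ... FROM ... JOIN form is not portable, the repointing is written as a correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE TableB (Id INTEGER PRIMARY KEY, Name TEXT, RefId INTEGER);
-- Hypothetical data: Name 'x' is duplicated under Ids 1, 2 and 4.
INSERT INTO TableA VALUES (1, 'x'), (2, 'x'), (3, 'y'), (4, 'x');
INSERT INTO TableB VALUES (10, 'b1', 1), (11, 'b2', 2), (12, 'b3', 3), (13, 'b4', 4);

-- Step 1: number the rows per Name; RN = 1 is the highest Id, the row to keep.
CREATE TEMP TABLE TestTable AS
SELECT Id, Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id DESC) AS RN
FROM TableA;

-- Step 2: repoint TableB rows that reference a duplicate to the kept Id.
UPDATE TableB
SET RefId = (
    SELECT ToKeep.Id
    FROM TestTable AS ToDelete
    JOIN TestTable AS ToKeep
      ON ToKeep.Name = ToDelete.Name AND ToKeep.RN = 1
    WHERE ToDelete.Id = TableB.RefId
)
WHERE RefId IN (SELECT Id FROM TestTable WHERE RN > 1);

-- Step 3: delete the duplicates from TableA.
DELETE FROM TableA WHERE Id IN (SELECT Id FROM TestTable WHERE RN > 1);
""")

refs = conn.execute("SELECT Id, RefId FROM TableB ORDER BY Id").fetchall()
remaining = [r[0] for r in conn.execute("SELECT Id FROM TableA ORDER BY Id")]
print(refs)       # [(10, 4), (11, 4), (12, 3), (13, 4)]
print(remaining)  # [3, 4]
```

Repointing before deleting matters: if the duplicates were removed first, the rows in TableB would briefly reference Ids that no longer exist.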
Update a column (in a table that has duplicates) with values from a column in another table
The duplicates are causing the error. Let's get a unique record for each sub_id and CURRENT_BRAND using row_number(). Try the query below:
UPDATE SERGIU_BI_CCM_AGG_MTH t1
SET t1.CURRENT_SEC = foo.CURRENT_SEC
FROM (
SELECT t1a.sub_id
, t1a.CURRENT_BRAND
, t2.CURRENT_SEC
, t2.from_date
, t2.to_date
, row_number() over (partition by t1a.sub_id,CURRENT_BRAND order by from_date desc) as rn
FROM SERGIU_BI_CCM_AGG_MTH t1a
LEFT JOIN BI_CCM_BASE t2 ON t1a.sub_id = t2.sub_id
and t1a.CURRENT_BRAND = t2.CURRENT_BRAND
)foo
WHERE t1.sub_id = foo.sub_id
and t1.CURRENT_BRAND = foo.CURRENT_BRAND
and t1.agg_mth between to_char(foo.from_date,'YYYYMM') and to_char(foo.to_date-1,'YYYYMM')
and foo.rn = 1;
Update column as Duplicate
The following does what you need: row_number identifies any group that contains a duplicate, and an updatable CTE flags every row that is part of such a group:
with d as (
select *, row_number() over(partition by a,b order by a,b) dn
from t
)
update d set d.status='D'
where exists (select * from d d2 where d2.a=d.a and d2.b=d.b and d2.dn>1)
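To see how this differs from the "all but one" updates, here is a small sketch using Python's sqlite3 module with made-up rows. SQLite cannot update through a CTE, so the sketch updates the base table and uses the CTE only inside EXISTS; the id column is an assumption added to make the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, status TEXT);
-- Hypothetical data: (1,1) appears twice, (1,2) once, (2,2) three times.
INSERT INTO t (a, b) VALUES (1, 1), (1, 1), (1, 2), (2, 2), (2, 2), (2, 2);
""")

# A group has a duplicate if any of its rows gets a row number above 1.
# Every row of such a group is flagged, including the first one.
conn.execute("""
WITH d AS (
    SELECT a, b, ROW_NUMBER() OVER (PARTITION BY a, b ORDER BY a, b) AS dn
    FROM t
)
UPDATE t
SET status = 'D'
WHERE EXISTS (SELECT 1 FROM d WHERE d.a = t.a AND d.b = t.b AND d.dn > 1)
""")

statuses = [r[0] for r in conn.execute("SELECT status FROM t ORDER BY id")]
print(statuses)  # ['D', 'D', None, 'D', 'D', 'D']
```

Only the unique (1, 2) row keeps a NULL status; both members of every duplicated pair are marked.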