How to delete duplicate rows without unique identifier
I like @erwin-brandstetter's solution, but wanted to show a solution with the USING keyword:
DELETE FROM table_with_dups T1
USING table_with_dups T2
WHERE T1.ctid < T2.ctid -- delete the "older" ones
AND T1.name = T2.name -- list columns that define duplicates
AND T1.address = T2.address
AND T1.zipcode = T2.zipcode;
If you want to review the records before deleting them, simply replace DELETE with SELECT * and USING with a comma, i.e.
SELECT * FROM table_with_dups T1
, table_with_dups T2
WHERE T1.ctid < T2.ctid -- select the "older" ones
AND T1.name = T2.name -- list columns that define duplicates
AND T1.address = T2.address
AND T1.zipcode = T2.zipcode;
Update: I tested some of the different solutions here for speed. If you don't expect many duplicates, this solution performs much better than the ones that have a NOT IN (...) clause, as those generate a lot of rows in the subquery. If you rewrite the query to use IN (...), it performs similarly to the solution presented here, but the SQL code becomes much less concise.
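For comparison, here is one possible IN (...) form, as a sketch only. It is not necessarily the query that was benchmarked above; it assumes PostgreSQL and the same table_with_dups columns, and the alias ranked and the column rn are names made up for illustration.

```sql
-- Sketch: number the rows inside each duplicate group, then delete
-- everything except the first row of each group.
DELETE FROM table_with_dups
WHERE ctid IN (
    SELECT ctid
    FROM (
        SELECT ctid,
               ROW_NUMBER() OVER (
                   PARTITION BY name, address, zipcode  -- columns that define duplicates
                   ORDER BY ctid DESC                   -- keep the row with the highest ctid
               ) AS rn
        FROM table_with_dups
    ) ranked
    WHERE rn > 1
);
```

Note that PARTITION BY groups NULL values together, whereas the = comparisons in the USING version never match on NULL, so the two forms can differ on rows with NULL key columns.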
Update 2: If you have NULL values in one of the key columns (which you really shouldn't IMO), you can use COALESCE() in the condition for that column, e.g.
AND COALESCE(T1.col_with_nulls, '[NULL]') = COALESCE(T2.col_with_nulls, '[NULL]')
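As an alternative sketch, PostgreSQL also offers IS NOT DISTINCT FROM, which treats two NULLs as equal without needing a placeholder value such as '[NULL]'. The example below assumes the same table_with_dups table, with col_with_nulls standing in for whichever key column may contain NULLs.

```sql
-- Sketch: NULL-safe duplicate match without a sentinel value.
DELETE FROM table_with_dups T1
USING table_with_dups T2
WHERE T1.ctid < T2.ctid
  AND T1.name = T2.name
  AND T1.address = T2.address
  -- true when both values are equal OR both are NULL
  AND T1.col_with_nulls IS NOT DISTINCT FROM T2.col_with_nulls;
```

This avoids the risk of the placeholder string colliding with a real value, and it works for non-text columns too.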
deleting duplicate row with no unique identifier
Here is a query that will remove duplicates and leave exactly one copy of each unique row. It will work with SQL Server 2005 or higher:
WITH Dups AS
(
SELECT tickId, timestamp, price,
ROW_NUMBER() OVER(PARTITION BY tickid, timestamp ORDER BY (SELECT 0)) AS rn
FROM stockData
)
DELETE FROM Dups WHERE rn > 1
SQL Server : delete duplicate rows without Unique ID
Use a CTE with row_number to delete the duplicates:
;with cte as
(
select *,row_number() over(order by pkID) RN
FROM yourtable
where pkID = 44
)
delete from cte where RN>1
Note: in the ORDER BY clause you can specify the order in which the duplicates are numbered, which determines which row is kept (the one with RN = 1) and which ones are deleted.
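The query above only handles pkID = 44. A minimal sketch of a variant that covers every value at once, assuming pkID alone defines what counts as a duplicate, adds PARTITION BY so the numbering restarts for each pkID:

```sql
-- Sketch (SQL Server): number rows within each pkID group,
-- then delete all but the first row of every group.
;WITH cte AS
(
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY pkID ORDER BY pkID) AS RN
    FROM yourtable
)
DELETE FROM cte WHERE RN > 1;
```

If other columns also take part in the duplicate definition, they would go into the PARTITION BY list as well.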
Delete duplicate records from a Postgresql table without a primary key?
Copy the distinct data to a work table, fk_payment1_copy. The simplest way to do that is to use INTO:
SELECT max(id),settlement_ref_no ...
INTO fk_payment1_copy
from fk_payment1
GROUP BY settlement_ref_no ...
Then delete all rows from fk_payment1:
delete from fk_payment1
Finally, copy the data from the fk_payment1_copy table back to fk_payment1:
insert into fk_payment1
select id,settlement_ref_no ...
from fk_payment1_copy
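Because the table is empty between the delete and the re-insert, it may be worth running the steps in a single transaction. The following is only a sketch of that idea for PostgreSQL: the table names payments and payments_copy are hypothetical, and it uses SELECT DISTINCT * (rows identical in every column) rather than the max(id) / GROUP BY form shown above.

```sql
BEGIN;

-- Work table holding one copy of each fully identical row.
CREATE TEMP TABLE payments_copy AS
SELECT DISTINCT * FROM payments;

-- Empty the original table; this is rolled back if a later step fails.
DELETE FROM payments;

-- Put the de-duplicated rows back.
INSERT INTO payments
SELECT * FROM payments_copy;

DROP TABLE payments_copy;

COMMIT;
```

If any statement fails, the transaction is aborted, and issuing ROLLBACK instead of COMMIT leaves the original table untouched.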
Deleting duplicates on column without primary keys or unique constraints
You can use the ctid system column to differentiate the rows:
DELETE FROM your_table t1
USING your_table t2
WHERE t1 = t2
AND t1.ctid > t2.ctid;
How to remove duplicates in postgres (no unique id)
Each table in Postgres has a few hidden system columns. One of them (ctid) holds the physical location of the row version, so it is unique within the table at any given moment and can be used in cases where a primary key is missing.
DELETE FROM tablename a
USING tablename b
WHERE a.ctid < b.ctid
AND a.user_id = b.user_id
AND a.time_id = b.time_id;
The underlying problem is the lack of a primary key. Relying on hidden columns should not become a systematic method. Once you have deleted the duplicates, you should create a primary key on (user_id, time_id) or add a new unique column for this purpose.
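A minimal sketch of that follow-up step in PostgreSQL, assuming the duplicates are already gone and that user_id and time_id contain no NULLs (the statement fails otherwise). The column name id in the alternative is hypothetical.

```sql
-- Option 1: make the natural key the primary key.
ALTER TABLE tablename
    ADD PRIMARY KEY (user_id, time_id);

-- Option 2 (instead of option 1, not in addition): add a surrogate key column.
-- ALTER TABLE tablename
--     ADD COLUMN id bigserial PRIMARY KEY;
```

With either constraint in place, later inserts that would recreate a duplicate key are rejected instead of silently accepted.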
Delete duplicate rows from table with no unique key
If you can afford to rewrite the whole table, this is probably the simplest approach:
WITH Deleted AS (
DELETE FROM discogs.releases_labels
RETURNING *
)
INSERT INTO discogs.releases_labels
SELECT DISTINCT * FROM Deleted
If you need to specifically target the duplicated records, you can make use of the internal ctid field, which uniquely identifies a row:
DELETE FROM discogs.releases_labels
WHERE ctid NOT IN (
SELECT MIN(ctid)
FROM discogs.releases_labels
GROUP BY label, release_id, catno
)
Be very careful with ctid; it changes over time. But you can rely on it staying the same within the scope of a single statement.
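Whichever of the two approaches is used, a quick check afterwards can confirm that no duplicates remain. This is a sketch that assumes, as in the query above, that label, release_id and catno define a duplicate; the column alias copies is made up for illustration.

```sql
-- Sketch: list any key combination that still occurs more than once.
-- An empty result means the de-duplication worked.
SELECT label, release_id, catno, COUNT(*) AS copies
FROM discogs.releases_labels
GROUP BY label, release_id, catno
HAVING COUNT(*) > 1;
```

Running the same query before the DELETE also gives a rough idea of how many rows are affected.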