How to remove duplicates based on a certain column in SQL Server?
We can use a deletable CTE along with ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY fid ORDER BY date DESC, name) rn
FROM yourTable
)
DELETE
FROM cte
WHERE rn > 1;
The above logic assigns rn = 1 to (i.e. spares) the record with the most recent date in each group of records sharing an fid. Should two records with the same fid also share the same latest date, it spares the one whose name sorts first.
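The same ranking logic can be tried out locally. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server; SQLite cannot delete through a CTE, so the ranked rows are matched back by rowid instead. The sample rows and the helper name dedupe_latest_per_fid are made up for illustration.

```python
import sqlite3


def dedupe_latest_per_fid(conn):
    """Keep one row per fid: latest date first, then the name that sorts first."""
    conn.execute(
        """
        DELETE FROM yourTable
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY fid ORDER BY date DESC, name
                       ) AS rn
                FROM yourTable
            )
            WHERE rn > 1  -- everything except the first-ranked row per fid
        )
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (fid INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, "beta", "2023-01-05"),
        (1, "alpha", "2023-01-05"),  # ties on the latest date; 'alpha' sorts first
        (1, "gamma", "2022-12-31"),
        (2, "delta", "2023-02-01"),
    ],
)
dedupe_latest_per_fid(conn)
remaining = conn.execute(
    "SELECT fid, name, date FROM yourTable ORDER BY fid"
).fetchall()
print(remaining)
```

Window functions need SQLite 3.25 or newer, so this sketch assumes a reasonably recent Python build.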
How can I remove duplicate rows?
Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) RowId as the row to keep. Then, just delete everything that didn't have a row id:
DELETE MyTable
FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, Col1, Col2, Col3
FROM MyTable
GROUP BY Col1, Col2, Col3
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
In case you have a GUID instead of an integer, you can replace MIN(RowId) with
CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn)))
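As a rough illustration of the keep-the-MIN(RowId) idea, here is a small sketch assuming Python's built-in sqlite3 module rather than SQL Server. It uses a NOT IN subquery in place of the outer join, which is a swapped-in but equivalent formulation as long as RowId is never NULL. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (RowId INTEGER PRIMARY KEY, Col1 TEXT, Col2 TEXT, Col3 TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable (Col1, Col2, Col3) VALUES (?, ?, ?)",
    [
        ("a", "b", "c"),  # RowId 1, kept (lowest RowId in its group)
        ("a", "b", "c"),  # RowId 2, duplicate
        ("x", "y", "z"),  # RowId 3, kept (only row in its group)
        ("a", "b", "c"),  # RowId 4, duplicate
    ],
)

# Delete every row whose RowId is not the minimum of its (Col1, Col2, Col3) group.
conn.execute(
    """
    DELETE FROM MyTable
    WHERE RowId NOT IN (
        SELECT MIN(RowId) FROM MyTable GROUP BY Col1, Col2, Col3
    )
    """
)

kept = conn.execute(
    "SELECT RowId, Col1, Col2, Col3 FROM MyTable ORDER BY RowId"
).fetchall()
print(kept)
```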
SQL query to remove duplicates from single column based on latest date
I have tried a few partitioning SQL queries and also a CTE, but I am not able to get the desired result.
Using QUALIFY (available in dialects such as Snowflake, Teradata and BigQuery, but not in SQL Server), it can be achieved without a CTE:
SELECT *
FROM tab
QUALIFY ROW_NUMBER() OVER(PARTITION BY COLUMN1 ORDER BY COLUMN2 DESC) = 1
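Where QUALIFY is not available, the same filter can be written with a derived table. The sketch below assumes Python's built-in sqlite3 module (SQLite has no QUALIFY) and invented sample rows; it is meant to show the equivalent shape of the query, not a dialect-specific recommendation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (COLUMN1 TEXT, COLUMN2 TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?)",
    [
        ("k1", "2024-01-01"),
        ("k1", "2024-03-01"),  # latest row for k1
        ("k2", "2024-02-15"),  # only row for k2
    ],
)

# Rank rows per COLUMN1 by COLUMN2 descending, then keep only rank 1.
latest = conn.execute(
    """
    SELECT COLUMN1, COLUMN2
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY COLUMN1 ORDER BY COLUMN2 DESC
               ) AS rn
        FROM tab AS t
    )
    WHERE rn = 1
    ORDER BY COLUMN1
    """
).fetchall()
print(latest)
```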
remove duplicate data from sql
You can do the following..
Creating a new table and keeping a random row:
First copy table disk (unique data) to temp table disk2, then drop table disk, then rename temp table disk2 to disk.
create table disk2 select * from disk group by d;
drop table disk;
rename table disk2 to disk;
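NOTE: Here we are using group by with * because the OP does not care which row is kept. This relies on MySQL with the ONLY_FULL_GROUP_BY SQL mode disabled; with that mode enabled the statement is generally rejected.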
Creating a new table and keeping the row with the min or max id:
Another way to do this while keeping the row with the min or max id
/*copy data from disk to temp table disk2*/
create table disk2 select * from disk
where id in (select min(id) from disk group by d);
/*drop table disk*/
drop table disk;
/*rename temp table to disk*/
rename table disk2 to disk;
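The copy, drop and rename sequence above can be sketched end to end. This is a minimal example assuming Python's built-in sqlite3 module instead of MySQL, so the statements are spelled CREATE TABLE ... AS SELECT and ALTER TABLE ... RENAME TO; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE disk (id INTEGER PRIMARY KEY, d TEXT)")
conn.executemany(
    "INSERT INTO disk (id, d) VALUES (?, ?)",
    [(1, "x"), (2, "x"), (3, "y"), (4, "y"), (5, "z")],
)

# Copy only the row with the smallest id for each value of d.
conn.execute(
    """
    CREATE TABLE disk2 AS
    SELECT * FROM disk
    WHERE id IN (SELECT MIN(id) FROM disk GROUP BY d)
    """
)
# Swap the deduplicated copy in place of the original table.
conn.execute("DROP TABLE disk")
conn.execute("ALTER TABLE disk2 RENAME TO disk")

rows = conn.execute("SELECT id, d FROM disk ORDER BY id").fetchall()
print(rows)
```

Note that a table created from a SELECT does not carry over the original primary key or indexes, so those would have to be recreated afterwards.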
UPDATE: Another way to do this
Deleting duplicates from existing table
/*first create a dups table for duplicates*/
create table dups select * from disk
where id not in (select min(id) from disk group by d);
/*now delete all rows which are present in dups table*/
delete from disk where id in (select id from dups);
/*now delete the dups table*/
drop table dups;
Best way to combine two tables, remove duplicates, but keep all other non-duplicate values in SQL
If I understand your question correctly, you want to combine two large tables with thousands of columns that (hopefully) are the same between the two tables, using the email column to match rows, and where a record exists in both tables, keep the record from Table 2.
I had to do something similar a few days ago so maybe you can modify my query for your purposes:
WITH only_in_table_1 AS(
SELECT *
FROM table_1 A
WHERE NOT EXISTS
(SELECT * FROM table_2 B WHERE B.email_field = A.email_field))
SELECT * FROM table_2
UNION ALL
SELECT * FROM only_in_table_1
If the columns/fields aren't the same between tables you can use a full outer join on only_in_table_1 and table_2.
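To see how the NOT EXISTS plus UNION ALL combination behaves, here is a small sketch assuming Python's built-in sqlite3 module and two tiny two-column tables. The name column and the sample emails are hypothetical; only email_field, table_1 and table_2 come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (email_field TEXT, name TEXT)")
conn.execute("CREATE TABLE table_2 (email_field TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [("a@example.com", "old A"), ("b@example.com", "B")],
)
conn.executemany(
    "INSERT INTO table_2 VALUES (?, ?)",
    [("a@example.com", "new A"), ("c@example.com", "C")],
)

# All of table_2, plus the table_1 rows whose email does not appear in table_2.
result = conn.execute(
    """
    WITH only_in_table_1 AS (
        SELECT *
        FROM table_1 A
        WHERE NOT EXISTS (
            SELECT 1 FROM table_2 B WHERE B.email_field = A.email_field
        )
    )
    SELECT * FROM table_2
    UNION ALL
    SELECT * FROM only_in_table_1
    """
).fetchall()
combined = sorted(result)  # sorted in Python only to make the output stable
print(combined)
```

The shared email a@example.com appears once, with the table_2 version of the record.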
SQL How to remove duplicates within select query?
You mention that there are date duplicates, but it appears they're quite unique down to the precision of seconds.
Can you clarify what precision of date you start considering dates duplicate - day, hour, minute?
In any case, you'll probably want to floor your datetime field. You didn't indicate which field is preferred when removing duplicates, so this query will prefer the last name in alphabetical order.
SELECT MAX(owner_name),
--floored to the second
dateadd(second,datediff(second,'2000-01-01',start_date),'2000-01-01') AS StartDate
FROM MyTable
GROUP BY dateadd(second,datediff(second,'2000-01-01',start_date),'2000-01-01')
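The effect of flooring is easier to see at a coarser precision. The sketch below assumes Python's built-in sqlite3 module, where strftime plays the role of the dateadd/datediff pair, and it floors to the minute rather than the second so that the invented sample rows actually collide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (owner_name TEXT, start_date TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        ("Alice", "2024-05-01 10:15:20"),
        ("Bob", "2024-05-01 10:15:45"),    # same minute as Alice
        ("Carol", "2024-05-01 10:16:05"),
    ],
)

# Floor start_date to the minute, then keep the alphabetically last owner per bucket.
deduped = conn.execute(
    """
    SELECT MAX(owner_name) AS owner_name,
           strftime('%Y-%m-%d %H:%M:00', start_date) AS StartDate
    FROM MyTable
    GROUP BY strftime('%Y-%m-%d %H:%M:00', start_date)
    ORDER BY StartDate
    """
).fetchall()
print(deduped)
```

Alice and Bob fall into the same minute, so only Bob (the later name alphabetically) survives for that bucket.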