SQL How to Remove Duplicates Within Select Query

SQL How to remove duplicates within select query?

You mention that there are date duplicates, but it appears they're quite unique down to the precision of seconds.

Can you clarify at what precision you start treating two dates as duplicates: day, hour, or minute?

In any case, you'll probably want to floor your datetime field to that precision. You didn't indicate which row is preferred when removing duplicates, so this query keeps the alphabetically last owner_name in each group.

SELECT MAX(owner_name) AS owner_name,
       -- floored to the second; change the datepart to minute, hour, or day for a coarser match
       DATEADD(second, DATEDIFF(second, '2000-01-01', start_date), '2000-01-01') AS StartDate
FROM MyTable
GROUP BY DATEADD(second, DATEDIFF(second, '2000-01-01', start_date), '2000-01-01')
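To see the flooring idea in action, here is a minimal sketch that runs against an in-memory SQLite database from Python. It is not the SQL Server query above: SQLite has no DATEADD/DATEDIFF, so strftime is used as a stand-in, and the sample rows and the choice of minute precision are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (owner_name TEXT, start_date TEXT)")

# Hypothetical sample rows: the first two fall within the same minute.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [
        ("Alice", "2013-01-08 10:15:03"),
        ("Bob", "2013-01-08 10:15:47"),
        ("Carol", "2013-01-08 10:16:20"),
    ],
)

# The literal ':00' in the format string replaces the seconds part,
# which floors every timestamp to the minute (SQLite's stand-in for
# the DATEADD/DATEDIFF expression).
floored_rows = conn.execute(
    """
    SELECT MAX(owner_name) AS owner_name,
           strftime('%Y-%m-%d %H:%M:00', start_date) AS StartDate
    FROM MyTable
    GROUP BY strftime('%Y-%m-%d %H:%M:00', start_date)
    ORDER BY StartDate
    """
).fetchall()

for row in floored_rows:
    print(row)
```

Alice and Bob collapse into one row because they share a floored minute, and MAX(owner_name) keeps Bob.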

Remove duplicates in Select query based on one column

You can also use ROW_NUMBER():

SELECT id, name
FROM (
    SELECT id, name, ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS rn
    FROM mytable
) x
WHERE rn = 1

This will retain the record that has the smallest name for each id (so '5d' will come before '5e'). With this technique, you can also sort on a column other than the one where the duplicates exist, which an aggregate query with MIN() cannot do. Queries using window functions may also perform better than the equivalent aggregate query, although this depends on the database engine and the available indexes.
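The sketch below illustrates that point with SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The updated_at column, the sample rows, and the dedupe helper are hypothetical and exist only to show how changing the ORDER BY inside the window changes which row survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, updated_at TEXT)")

# Hypothetical rows: id 5 appears twice.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (5, "5e", "2020-01-02"),
        (5, "5d", "2020-01-01"),
        (6, "6a", "2020-01-03"),
    ],
)


def dedupe(order_by):
    """Keep one row per id; order_by decides which row gets rn = 1.

    order_by is interpolated into the SQL text, so pass only trusted,
    hard-coded strings.
    """
    sql = f"""
        SELECT id, name
        FROM (
            SELECT id, name,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY {order_by}) AS rn
            FROM mytable
        ) x
        WHERE rn = 1
        ORDER BY id
    """
    return conn.execute(sql).fetchall()


by_name = dedupe("name")                # smallest name per id
by_latest = dedupe("updated_at DESC")   # most recently updated row per id

print(by_name)
print(by_latest)
```

Sorting by name keeps '5d' for id 5, while sorting by updated_at DESC keeps '5e', even though the returned columns are the same.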

SQL Server - How to remove duplicates within select query?

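If you want to delete the duplicate rows rather than just hide them in a SELECT, you can use a subquery as below. It keeps the row with the highest Id in each group of duplicates. Do not limit the subquery with TOP: any group left out of the list would have all of its rows deleted.

DELETE FROM yourtable
WHERE Id NOT IN
(
    -- one surviving Id per group of duplicates
    SELECT MAX(Id)
    FROM yourtable
    GROUP BY IdRow, IdAudience, IdAb, Quantity
)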
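As a rough check of the keep-the-highest-Id approach, the following sketch runs the delete against an in-memory SQLite database from Python. SQLite has no TOP clause, so the subquery here is unbounded; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE yourtable (
        Id INTEGER PRIMARY KEY,
        IdRow INTEGER,
        IdAudience INTEGER,
        IdAb INTEGER,
        Quantity INTEGER
    )
    """
)

# Hypothetical rows: Ids 1 and 2 are duplicates, and so are Ids 4 and 5.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 1, 1, 5),
        (2, 10, 1, 1, 5),
        (3, 10, 1, 1, 7),
        (4, 20, 2, 1, 5),
        (5, 20, 2, 1, 5),
    ],
)

# Delete every row whose Id is not the highest Id of its duplicate group.
conn.execute(
    """
    DELETE FROM yourtable
    WHERE Id NOT IN (
        SELECT MAX(Id)
        FROM yourtable
        GROUP BY IdRow, IdAudience, IdAb, Quantity
    )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT Id FROM yourtable ORDER BY Id")]
print(remaining_ids)
```

Row 3 survives because its Quantity differs, so it forms a group of its own.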

How to remove duplicate entries in SQL using selected columns only?

One method is group by:

select id, name, date, gender, country, min(vendor) as vendor
from t
group by id, name, date, gender, country;

Because of min(), this returns the alphabetically smallest vendor in each group, which is an "arbitrary" choice unless that happens to be the vendor you want. Tables in SQL represent unordered sets, so there is no concept of a 4th, 5th, or 6th row. If you want one particular vendor value, you need to specify how that value is determined.
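A small sketch, again using SQLite from Python with invented sample data, shows that the surviving vendor is decided by min() and not by the order in which the rows were inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, name TEXT, date TEXT, gender TEXT, country TEXT, vendor TEXT)"
)

# Hypothetical rows: id 1 is listed twice, with 'Zeta' inserted before 'Acme'.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Ann", "2020-01-01", "F", "US", "Zeta"),
        (1, "Ann", "2020-01-01", "F", "US", "Acme"),
        (2, "Bob", "2020-02-01", "M", "UK", "Beta"),
    ],
)

grouped_rows = conn.execute(
    """
    SELECT id, name, date, gender, country, MIN(vendor) AS vendor
    FROM t
    GROUP BY id, name, date, gender, country
    ORDER BY id
    """
).fetchall()

for row in grouped_rows:
    print(row)
```

Although 'Zeta' was inserted first, the group for id 1 reports 'Acme', the smallest value.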

How to remove duplicates via a select if there are two deciding columns

You can use aggregation:

select min(id) as id, personid, location
from t
group by personid, location;

How to eliminate duplicates from select query?

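If you don't need to keep the different yr values, just use DISTINCT ON (field_name), which is specific to PostgreSQL. Without an ORDER BY, which row survives for each userID is unpredictable.

SELECT DISTINCT ON (userID) userID, name, yr FROM TABLE_NAME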

How to remove duplicates based on a certain column in SQL Server?

We can use a deletable CTE along with ROW_NUMBER here:

WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY fid ORDER BY date DESC, name) rn
FROM yourTable
)

DELETE
FROM cte
WHERE rn > 1;

The above logic assigns rn = 1 to (and therefore spares) the record with the most recent date in each group of records sharing the same fid. Should two records with the same fid also share the same latest date, it spares the one whose name sorts first.
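The tie-breaking rule can be tried out with the sketch below. It is an adaptation, not the SQL Server statement itself: SQLite cannot delete through a CTE, so the row numbers are computed in a subquery and matched back through SQLite's implicit rowid. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (fid INTEGER, name TEXT, date TEXT)")

# Hypothetical rows: fid 1 has two records tied on the latest date.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        (1, "beta", "2021-05-01"),
        (1, "alpha", "2021-05-01"),
        (1, "gamma", "2020-01-01"),
        (2, "delta", "2019-03-03"),
    ],
)

# Number the rows per fid (latest date first, then name) and delete
# everything except rn = 1, using rowid to identify the rows to remove.
conn.execute(
    """
    DELETE FROM yourTable
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY fid ORDER BY date DESC, name) AS rn
            FROM yourTable
        )
        WHERE rn > 1
    )
    """
)

survivors = conn.execute(
    "SELECT fid, name, date FROM yourTable ORDER BY fid"
).fetchall()
print(survivors)
```

For fid 1, 'alpha' and 'beta' tie on the latest date, and 'alpha' is kept because it sorts first.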

Remove duplicate rows based on field in a select query with PostgreSQL?

Use DISTINCT ON:

SELECT DISTINCT ON (contenthash)
id,
contenthash,
filesize,
to_timestamp(timecreated) :: DATE
FROM mdl_files
ORDER BY contenthash, timecreated, id;

DISTINCT ON is a Postgres extension that returns exactly one row for each unique combination of the keys in parentheses. The row returned is the first one in each group according to the ORDER BY clause, whose leading expressions must match the DISTINCT ON keys.
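On engines without DISTINCT ON, the same "first row per key" behaviour can be approximated with ROW_NUMBER(). The sketch below does this in SQLite from Python; it is an emulation under assumed sample data, not Postgres itself, and it leaves out the date conversion for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mdl_files (id INTEGER, contenthash TEXT, filesize INTEGER, timecreated INTEGER)"
)

# Hypothetical rows: contenthash 'aaa' appears three times.
conn.executemany(
    "INSERT INTO mdl_files VALUES (?, ?, ?, ?)",
    [
        (1, "aaa", 100, 1500000300),
        (2, "aaa", 100, 1500000100),
        (3, "bbb", 250, 1500000200),
        (4, "aaa", 100, 1500000100),
    ],
)

# PARTITION BY plays the role of the DISTINCT ON keys; the window's
# ORDER BY plays the role of the remaining ORDER BY expressions.
first_per_hash = conn.execute(
    """
    SELECT id, contenthash, filesize
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY contenthash
                   ORDER BY timecreated, id
               ) AS rn
        FROM mdl_files
    ) ranked
    WHERE rn = 1
    ORDER BY contenthash
    """
).fetchall()

print(first_per_hash)
```

For 'aaa', ids 2 and 4 share the earliest timecreated, so the id tie-breaker picks id 2.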


