Eliminating Duplicate Values Based on Only One Column of the Table

Eliminating duplicate values based on only one column of the table

This is where the window function row_number() comes in handy:

SELECT s.siteName, s.siteIP, h.date
FROM sites s INNER JOIN
     (select h.*, row_number() over (partition by siteName order by date desc) as seqnum
      from history h
     ) h
    ON s.siteName = h.siteName and seqnum = 1
ORDER BY s.siteName, h.date

Remove duplicates in Select query based on one column

You can also use ROW_NUMBER():

SELECT id, name
FROM (
    SELECT id, name, ROW_NUMBER() OVER(PARTITION BY id ORDER BY name) rn 
    FROM mytable
) x
WHERE rn = 1

This will retain the record that has the smallest name (so '5d' will come before '5e'). With this technique, you can also use a sort criteria on another column that the one where duplicates exists (which an aggregate query with MIN() cannot do). Also, queries using window functions usually perform better than the equivalent aggregate query.

Remove duplicate values based on only one column of the table

Step#1 use table aliasses:

SELECT DISTINCT ta.ArtCode
        , ta.ArtOms
        , ldet.AldArtCodeBijLeverancier
        , ldet.AldInkoopPrijs
        , vrd.MagVoorraad
        , vrd.LocNaam
        , mut.VrdMutDatum
        , naw.NawFilNummer
FROM KingSystem.tabArtikel ta
        , KingSystem.tabArtikelLeverancier tal
        , KingSystem.tabArtikelLeverancierDetail ldet
        , KingSystem.tabNawFile naw
        , KingSystem.tabVoorraadMutatie mut
        , KingSystem.vwKMBVoorraad vrd
WHERE vrd.ArtGid = ta.ArtGid 
  AND tal.ArtLevArtGid = ta.ArtGid
  AND ldet.AldArtLevGid = tal.ArtLevGid
  AND mut.VrdMutArtGid = ta.ArtGid
  AND naw.NawFilNawGid = tal.ArtLevNawGid
  AND vrd.MagVoorraad>0 AND (ta.ArtCode Not Like 'V%' 
--      ORDER BY mut.VrdMutDatum DESC -- nonsens!
        ;

Step#2 : use JOIN-syntax (and remove the distinct):

SELECT -- DISTINCT
       ta.ArtCode
        , ta.ArtOms
        , ldet.AldArtCodeBijLeverancier
        , ldet.AldInkoopPrijs
        , vrd.MagVoorraad
        , vrd.LocNaam
        , mut.VrdMutDatum
        , naw.NawFilNummer
FROM KingSystem.tabArtikel ta
JOIN KingSystem.vwKMBVoorraad vrd ON vrd.ArtGid = ta.ArtGid
JOIN KingSystem.tabArtikelLeverancier tal ON tal.ArtLevArtGid = ta.ArtGid
JOIN KingSystem.tabArtikelLeverancierDetail ldet ON ldet.AldArtLevGid = tal.ArtLevGid
JOIN KingSystem.tabNawFile naw ON naw.NawFilNawGid = tal.ArtLevNawGid
JOIN KingSystem.tabVoorraadMutatie mut ON mut.VrdMutArtGid = ta.ArtGid
WHERE vrd.MagVoorraad>0 AND ta.ArtCode Not Like 'V%'
--      ORDER BY mut.VrdMutDatum DESC -- nonsens!
        ;

step#3 : add condition to suppress the old mutations:

SELECT -- DISTINCT
       ta.ArtCode
        , ta.ArtOms
        , ldet.AldArtCodeBijLeverancier
        , ldet.AldInkoopPrijs
        , vrd.MagVoorraad
        , vrd.LocNaam
        , mut.VrdMutDatum
        , naw.NawFilNummer
FROM KingSystem.tabArtikel ta
JOIN KingSystem.vwKMBVoorraad vrd
        ON vrd.ArtGid = ta.ArtGid
JOIN KingSystem.tabArtikelLeverancier tal
        ON tal.ArtLevArtGid = ta.ArtGid
JOIN KingSystem.tabArtikelLeverancierDetail ldet
        ON ldet.AldArtLevGid = tal.ArtLevGid
JOIN KingSystem.tabNawFile naw
        ON naw.NawFilNawGid = tal.ArtLevNawGid
JOIN KingSystem.tabVoorraadMutatie mut
        ON mut.VrdMutArtGid = ta.ArtGid
        AND NOT EXISTS( -- suppress older mutations; keeping only the most recent
        SELECT * FROM KingSystem.tabVoorraadMutatie mx
        WHERE mx.VrdMutArtGid = mut.VrdMutArtGid
        AND mx.VrdMutDatum > mut.VrdMutDatum
        )
WHERE vrd.MagVoorraad > 0 AND ta.ArtCode Not Like 'V%'
--      ORDER BY mut.VrdMutDatum DESC -- nonsens!
        ;

Voila! (that'll be approx 50 euro)

Step#4: reorder the columns, and add an order by clause

UPDATE: I removed the DISTINCT It servers no purpose, and can only hide possible errors (and introduce other,more serious errors)

remove duplicate values of only one column value from all the available columns in sql query

This will return 1 row for each value of beam_current:

;WITH CTE AS
(
SELECT
  row_number() over (partition by beam_current order by (select 1)) rn,
  beam_current, logtime, beam_energy 
FROM INDUS2_BDS.dbo.DCCT 
WHERE 
  logtime between '2014-08-09 01:13:03' and '2014-08-09 02:16:53'
  and (beam_current like '%9.96' or beam_current like '%9.97' 
       or beam_current like '%9.98' or  beam_current like '%9.99'
       or beam_current like '%0' or beam_current like '%_0.01' 
       or beam_current like '%_0.02' or beam_current like '%_0.03' 
       or beam_current like '%_0.04' or beam_current like '%_0.05' 
       or beam_current like '%_0.06') 
       and beam_energy between 550 and 552
)
SELECT beam_current, logtime, beam_energy 
FROM CTE
WHERE rn = 1

Remove duplicate rows based on values from one column

You can create a temporary table. In the below example this is called #newtable. The hashtag is important as this is actually what makes it a 'temporary' table (not everyone explains this).

The below might prove useful to others as it includes WHERE conditions which most examples do not have online:

-- First create your temp table 
SELECT CONVERT(DATE,a.ins_timestamp) AS 'Date',  
a.Prod_code, 
a.Curr_boxes, 
a.Label_barcode, 
b.From_ord_no, 
NULL AS To_ord_no, 
CASE
    WHEN a.From_batch >= a.To_batch THEN a.From_batch 
    WHEN a.To_batch >= a.From_batch THEN a.To_batch 
    ELSE                             a.From_batch
END AS 'Batch',  
a.Weight, 
'IN' AS 'Direction' 

INTO #newtable 

FROM a 

JOIN b ON a.Label_barcode = b.Label_barcode 

WHERE (a.ins_timestamp Between ? And ?) AND (a.To_batch = ?) AND (a.From_batch = 0) AND (a.Type='Consumption') AND (a.To_status<>'STOCK') AND (b.From_status = 'PORDER') 

-- Now we insert the second query into the already created table
    INSERT INTO #newtable 

SELECT CONVERT(DATE,b.ins_timestamp) AS 'Date',  
b.Prod_code, 
b.Curr_boxes, 
b.Label_barcode, 
NULL AS From_ord_no, 
NULL AS To_ord_no, 
CASE
    WHEN b.From_batch >= b.To_batch THEN b.From_batch 
    WHEN b.To_batch >= b.From_batch THEN b.To_batch 
    ELSE                             b.From_batch
END AS 'Batch',  
b.Weight, 
'IN' AS 'Direction' 

FROM b 

WHERE (b.From_batch = 0) AND (b.Type='Consumption') AND (b.ins_timestamp Between ? And ?) AND (b.To_batch = ?) AND (b.To_status<>'STOCK') 

-- Now we can select whatever we want from our temp table
SELECT Date,  
    Prod_code, 
    Curr_boxes, 
    Label_barcode, 
    max(From_ord_no) From_ord_no, 
    To_ord_no, 
    Batch,  
    Weight, 
    Direction 

FROM #newtable 

GROUP BY Date,  
    Prod_code, 
    Curr_boxes, 
    Label_barcode, 
    To_ord_no, 
    Batch,  
    Weight, 
    Direction

remove duplicate rows based on one column value

This is similar to Gordon Linoff's query, but without the subquery:

DELETE t1 FROM table t1
  JOIN table t2
  ON t2.refID = t1.refID
  AND t2.ID < t1.ID

This uses an inner join to only delete rows where there is another row with the same refID but lower ID.

The benefit of avoiding a subquery is being able to utilize an index for the search. This query should perform well with a multi-column index on refID + ID.

How to delete rows that have duplicate column combination

You can try to use delete JOIN

DELETE t1
FROM [Table] t1
INNER JOIN (
  SELECT Column1,
       Column2,
       max(RefDate) as MaxDate
  FROM [Table]
  GROUP BY Column1, Column2
) t2
  ON t1.Column1 = t2.Column1
 AND t1.Column2 = t2.Column2
 AND t1.RefDate <> t2.MaxDate

or use EXISTS subquery.

DELETE t1
FROM [Table] t1
WHERE EXISTS (
  SELECT 1
  FROM [Table] t2
  WHERE t1.Column1 = t2.Column1
  AND t1.Column2 = t2.Column2
  HAVING max(t2.RefDate) <> t1.RefDate
)

sqlfiddle

Eliminating Duplicate Values Based on Only One Column of the Table