Removing duplicates from a SQL query (not just use distinct)
Arbitrarily choosing to keep the minimum PIC_ID. Also, avoid using the implicit join syntax.
SELECT U.NAME, MIN(P.PIC_ID)
FROM USERS U
INNER JOIN POSTINGS P1
ON U.EMAIL_ID = P1.EMAIL_ID
INNER JOIN PICTURES P
ON P1.PIC_ID = P.PIC_ID
WHERE P.CAPTION LIKE '%car%'
GROUP BY U.NAME;
How to select records without duplicate on just one field in SQL?
Try this:
SELECT MIN(id) AS id, title
FROM tbl_countries
GROUP BY title
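One way to sanity-check this query is to run it against a throwaway in-memory SQLite database from Python. This is only a sketch: the sample rows are invented for illustration, and only the table and column names come from the query above.

```python
import sqlite3

# In-memory database; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_countries (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO tbl_countries (id, title) VALUES (?, ?)",
    [(1, "Canada"), (2, "Canada"), (3, "Mexico"), (4, "Canada")],
)

# One row per title, keeping the smallest id in each group.
rows = conn.execute(
    "SELECT MIN(id) AS id, title FROM tbl_countries GROUP BY title ORDER BY title"
).fetchall()
print(rows)
```

Each title appears once in the result, paired with the lowest id it had in the table.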
SQL to remove duplicate records without a distinct
Here's one way using a window function to get the count of matching records:
SELECT
columnA,
columnB,
columnC
FROM
(
SELECT
columnA,
columnB,
columnC,
COUNT(*) OVER (PARTITION BY columnA, columnB) as rcount
FROM table
) sub
WHERE
(sub.rcount = 2 AND columnC = 'John')
OR sub.rcount = 1;
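The filter above can be tried on a small in-memory SQLite database from Python. This is a sketch under assumptions: the table name people and the sample rows are hypothetical (the original uses the placeholder name table, which is a reserved word), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a hypothetical stand-in for the placeholder table name.
conn.execute("CREATE TABLE people (columnA INTEGER, columnB TEXT, columnC TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "x", "John"), (1, "x", "Mary"), (2, "y", "Anna")],
)

query = """
SELECT columnA, columnB, columnC
FROM (
    SELECT columnA, columnB, columnC,
           COUNT(*) OVER (PARTITION BY columnA, columnB) AS rcount
    FROM people
) sub
WHERE (sub.rcount = 2 AND columnC = 'John')
   OR sub.rcount = 1
ORDER BY columnA
"""
# Pairs that occur twice keep only the 'John' row; unique pairs pass through.
kept = conn.execute(query).fetchall()
print(kept)
```

Note that the condition is specific to groups of exactly two rows; a group of three or more would be dropped entirely.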
ORACLE SQL select distinct not removing duplicates
You misunderstand what distinct is. It is not a function. It is a modifier on select and it affects all columns being selected. So, it is behaving exactly as it should.
If you want aggregations by zip code and week, then those are the only two columns that should be in the group by:
SELECT vo.ZIP_CODE, TO_CHAR(ca.CALENDAR_WEEK),
-- vo.REGION_ID
COUNT(vo.ORDER_ID),
SUM(vo.AMOUNT)
FROM VENDOR_ORDERS vo JOIN
CALENDAR ca
ON TRUNC(vo.ORDER_CREATION_DATETIME) = ca.CALENDAR_DATE
WHERE vo.REGION_ID = 1
GROUP BY vo.ZIP_CODE, TO_CHAR(ca.CALENDAR_WEEK)
You could probably include region_id as well, assuming that each zip code is in one region.
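The difference between DISTINCT over every selected column and grouping by just two of them can be seen on a tiny example. The sketch below uses Python's sqlite3 module rather than Oracle, with a simplified, hypothetical vendor_orders table (a plain week column instead of the CALENDAR join) and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema; the rows are made-up sample data.
conn.execute(
    "CREATE TABLE vendor_orders "
    "(zip_code TEXT, week INTEGER, order_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO vendor_orders VALUES (?, ?, ?, ?)",
    [("10001", 1, 100, 5.0), ("10001", 1, 101, 7.0), ("10002", 1, 102, 3.0)],
)

# DISTINCT looks at all selected columns, so differing order_id values
# keep both rows for zip 10001.
distinct_all = conn.execute(
    "SELECT DISTINCT zip_code, week, order_id FROM vendor_orders"
).fetchall()

# Grouping by zip code and week collapses them into one row per pair.
grouped = conn.execute(
    "SELECT zip_code, week, COUNT(order_id), SUM(amount) "
    "FROM vendor_orders GROUP BY zip_code, week ORDER BY zip_code"
).fetchall()
print(len(distinct_all), grouped)
```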
DISTINCT does not remove duplicates
First of all, this query is working correctly: the seeria_nr and paigalduse_aeg values differ between the rows, as you can see, so DISTINCT cannot filter them out.
You can use GROUP BY to get what you want:
GROUP BY
b.kasutaja_nimi
,b.eesnimi
,b.perenimi
,a.r_nimetus
This will give you the result that you expect, but remember that seeria_nr and paigalduse_aeg will then show arbitrary values from each group. Wrap them in an aggregate such as MIN or MAX if you need a predictable value.
How do I delete all the duplicate records in a MySQL table without temp tables
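Add a unique index on your table. The IGNORE keyword makes MySQL drop the rows that would violate the index; note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions: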
ALTER IGNORE TABLE `TableA`
ADD UNIQUE INDEX (`member_id`, `quiz_num`, `question_num`, `answer_num`);
Another way to do this would be:
Add a primary key (here id) to your table; you can then remove the duplicates, keeping the row with the lowest id in each group, using the following query:
DELETE FROM member
WHERE id NOT IN (SELECT keep_id
FROM (SELECT MIN(id) AS keep_id FROM member
GROUP BY member_id, quiz_num, question_num, answer_num
) AS A
);
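The idea of keeping one row per group can be checked on a small in-memory SQLite table from Python. The sketch below deletes every row whose id is not the lowest in its group; the sample rows are invented, and SQLite stands in for MySQL here, so the extra derived-table wrapper that MySQL needs when a DELETE reads from its own target table is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, member_id INTEGER, "
    "quiz_num INTEGER, question_num INTEGER, answer_num INTEGER)"
)
# Made-up sample data: ids 1-3 are duplicates of each other, id 4 is unique.
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 1), (2, 1, 1, 1, 1), (3, 1, 1, 1, 1), (4, 2, 1, 1, 1)],
)

# Keep the lowest id of each group and delete everything else.
conn.execute(
    """
    DELETE FROM member
    WHERE id NOT IN (
        SELECT MIN(id) FROM member
        GROUP BY member_id, quiz_num, question_num, answer_num
    )
    """
)
remaining = [row[0] for row in conn.execute("SELECT id FROM member ORDER BY id")]
print(remaining)
```

Selecting by MIN(id) rather than by a bare id with HAVING COUNT(*) > 1 matters when a group has three or more copies: all but one row are removed in a single pass.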
Delete duplicate rows from a BigQuery table
You can remove duplicates by running a query that rewrites your table (you can use the same table as the destination, or you can create a new table, verify that it has what you want, and then copy it over the old table).
A query that should work is here:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY Fixed_Accident_Index)
row_number
FROM Accidents.CleanedFilledCombined
)
WHERE row_number = 1
How to remove duplicates in query for google big query by a subset of returned rows, and keep first?
As @Jaytiger has mentioned in the comments, we have to use the ROW_NUMBER() function coupled with PARTITION BY and ORDER BY clauses.
Consider the query below. I have tested the query on sample data and have compared the results with that of a pandas snippet.
SELECT * from
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY column1, column6 ORDER BY columnX) row_num
FROM
`<project-id>.test_dataset.keep_first_in_duplicate`
)
where row_num=1
Whether you need the ORDER BY clause depends on whether the order of the input data must be preserved. Unlike a pandas dataframe, BigQuery does not preserve the order of input data. If we wish to preserve the order, we have to add a column of indices that can be used to sort the data after ingesting it into BigQuery. In summary, if your data source follows a certain order, the deduplication output from BigQuery can differ from that of the pandas dataframe unless you sort on such a column.
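The keep-first behaviour can be illustrated locally with Python's sqlite3 module standing in for BigQuery (window functions need SQLite 3.25 or newer). This is a sketch under assumptions: columnX is treated as the ingestion-order index, the payload column and all rows are hypothetical, and the rows are deliberately inserted out of order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# columnX plays the role of the index column recording the original order;
# payload is a hypothetical extra column.
conn.execute(
    "CREATE TABLE keep_first_in_duplicate "
    "(column1 TEXT, column6 TEXT, columnX INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO keep_first_in_duplicate VALUES (?, ?, ?, ?)",
    [("a", "k", 2, "second"), ("a", "k", 1, "first"), ("b", "k", 3, "only")],
)

query = """
SELECT column1, column6, payload
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY column1, column6 ORDER BY columnX
           ) AS row_num
    FROM keep_first_in_duplicate
)
WHERE row_num = 1
ORDER BY column1
"""
# Within each (column1, column6) group, the row with the smallest columnX wins.
first_rows = conn.execute(query).fetchall()
print(first_rows)
```

Because the winner is chosen by columnX and not by storage order, the "first" row is kept even though it was inserted after the "second" one.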