Prevent duplicate values in LEFT JOIN
I like to call this problem "cross join by proxy". Since there is no information (WHERE or JOIN condition) about how the tables department and contact are supposed to match up, they are cross-joined via the proxy table person, giving you the Cartesian product. Very similar to this one:
- Two SQL LEFT JOINS produce incorrect result
More explanation there.
Solution for your query:
SELECT p.id, p.person_name, d.department_name, c.phone_number
FROM person p
LEFT JOIN (
SELECT person_id, min(department_name) AS department_name
FROM department
GROUP BY person_id
) d ON d.person_id = p.id
LEFT JOIN (
SELECT person_id, min(phone_number) AS phone_number
FROM contact
GROUP BY person_id
) c ON c.person_id = p.id;
You did not define which department or phone number to pick, so I arbitrarily chose the minimum. You can have it any other way ...
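To make the "cross join by proxy" effect concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows (one person with two departments and two phone numbers) and the join columns of the naive query are assumptions for illustration, not data from the question.

```python
import sqlite3

# Hypothetical sample data: one person, two departments, two phone numbers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person     (id INTEGER PRIMARY KEY, person_name TEXT);
    CREATE TABLE department (person_id INTEGER, department_name TEXT);
    CREATE TABLE contact    (person_id INTEGER, phone_number TEXT);
    INSERT INTO person     VALUES (1, 'Alice');
    INSERT INTO department VALUES (1, 'Sales'), (1, 'Support');
    INSERT INTO contact    VALUES (1, '555-0100'), (1, '555-0199');
""")

# Naive double LEFT JOIN: every department row pairs with every contact row.
naive = conn.execute("""
    SELECT p.id, p.person_name, d.department_name, c.phone_number
    FROM person p
    LEFT JOIN department d ON d.person_id = p.id
    LEFT JOIN contact    c ON c.person_id = p.id
""").fetchall()

# Fix: collapse each child table to one row per person before joining.
fixed = conn.execute("""
    SELECT p.id, p.person_name, d.department_name, c.phone_number
    FROM person p
    LEFT JOIN (
        SELECT person_id, min(department_name) AS department_name
        FROM department
        GROUP BY person_id
    ) d ON d.person_id = p.id
    LEFT JOIN (
        SELECT person_id, min(phone_number) AS phone_number
        FROM contact
        GROUP BY person_id
    ) c ON c.person_id = p.id
""").fetchall()

print(len(naive))  # 4 rows: 2 departments x 2 phone numbers
print(fixed)       # [(1, 'Alice', 'Sales', '555-0100')]
```

The naive query returns the product of the two child tables per person, while the aggregated version returns exactly one row per person.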
How does one use join in mysql and avoid duplicate entries in response
First, using DISTINCT * is counterproductive: you are essentially selecting every row in the table and then eliminating duplicate rows. Try to avoid it.
Since you have already tried DISTINCT, that rules out the possibility that you started with duplicate data in your tables.
Looking at your screenshot, I think the rows are not duplicates. They might be identical in certain columns, but they can't be completely identical. For example:
media:
id          name
----------- ---------------
1           mediaA
2           mediaB
3           mediaC
media_creditsDATA:
media_id    credit_id   name
----------- ----------- ---------------
1           1           good credit
1           2           ok credit
2           3           bad credit
3           4           no credit
If you execute the following SQL, with or without DISTINCT, the result is the same:
SELECT *
FROM media
INNER JOIN media_creditsDATA ON media.id = media_creditsDATA.media_id
Result:
id          name            media_id    credit_id   name
----------- --------------- ----------- ----------- ---------------
1           mediaA          1           1           good credit
1           mediaA          1           2           ok credit
2           mediaB          2           3           bad credit
3           mediaC          3           4           no credit
If you only look at the first three columns of the result, then sure, there are duplicate records, but not if you look at all the columns. As you can see, the media table has a one-to-many relationship to the media_creditsDATA table. The result has records that share the same values in a subset of columns, but there are no fully duplicate records.
So I think the problem in this case is not how you join but how you filter your result. For instance, is there a subset of credit records you are looking for in the media_creditsDATA table? Or maybe you don't care and just want the record with the highest credit_id for each media record:
SELECT *
FROM media
INNER JOIN (
SELECT media_id, max(credit_id) AS highest_credit_id
FROM media_creditsDATA
GROUP BY media_id
) media_creditsDATA ON media.id = media_creditsDATA.media_id
You get:
id          name            media_id    highest_credit_id
----------- --------------- ----------- -----------------
1           mediaA          1           2
2           mediaB          2           3
3           mediaC          3           4
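The media example can be reproduced as a small sketch with Python's built-in sqlite3 module. The column types are assumptions, and the derived table is aliased as c (a name chosen here) to keep it distinct from the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE media_creditsDATA (media_id INTEGER, credit_id INTEGER, name TEXT);
    INSERT INTO media VALUES (1, 'mediaA'), (2, 'mediaB'), (3, 'mediaC');
    INSERT INTO media_creditsDATA VALUES
        (1, 1, 'good credit'), (1, 2, 'ok credit'),
        (2, 3, 'bad credit'),  (3, 4, 'no credit');
""")

join_sql = """
    FROM media
    INNER JOIN media_creditsDATA ON media.id = media_creditsDATA.media_id
"""
# DISTINCT changes nothing because no two joined rows match in every column.
plain = conn.execute("SELECT *" + join_sql).fetchall()
distinct = conn.execute("SELECT DISTINCT *" + join_sql).fetchall()

# Keep one credit per media row by aggregating before the join.
highest = conn.execute("""
    SELECT media.id, media.name, c.highest_credit_id
    FROM media
    INNER JOIN (
        SELECT media_id, max(credit_id) AS highest_credit_id
        FROM media_creditsDATA
        GROUP BY media_id
    ) c ON media.id = c.media_id
    ORDER BY media.id
""").fetchall()

print(len(plain), len(distinct))  # 4 4
print(highest)  # [(1, 'mediaA', 2), (2, 'mediaB', 3), (3, 'mediaC', 4)]
```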
INSERT INTO SELECT with a LEFT JOIN to prevent duplicates, only prevents duplicates already in the table
You are correct on the "snapshot" point: any insertions into table1 in this query will not affect the LEFT JOIN table1.
But you would still need a DISTINCT to guarantee uniqueness from the queried data.
INSERT INTO table1
SELECT DISTINCT
t2.col1,
t2.col2
FROM table2 t2
LEFT JOIN table1 t1
ON t2.col1 = t1.col1
AND t2.col2 = t1.col2
WHERE t1.col1 IS NULL
However:
- LEFT JOIN is a poor man's replacement for NOT EXISTS and EXCEPT, which the optimizer understands much better
- You should always specify column names in an INSERT
So your code should look like one of these options:
INSERT INTO table1 (col1, col2)
SELECT DISTINCT
t2.col1,
t2.col2
FROM table2 t2
WHERE NOT EXISTS (SELECT 1
FROM table1 t1
WHERE t2.col1 = t1.col1
AND t2.col2 = t1.col2);
INSERT INTO table1 (col1, col2)
SELECT DISTINCT
t2.col1,
t2.col2
FROM table2 t2
WHERE NOT EXISTS ( -- or you can use EXISTS/EXCEPT
SELECT t2.col1, t2.col2
INTERSECT
SELECT t1.col1, t1.col2
FROM table1 t1);
INSERT INTO table1 (col1, col2)
SELECT -- EXCEPT implies DISTINCT
t2.col1,
t2.col2
FROM table2 t2
EXCEPT
SELECT t1.col1, t1.col2
FROM table1 t1;
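As a rough check of why DISTINCT still matters, the sketch below runs the NOT EXISTS and EXCEPT variants against SQLite through Python's sqlite3 module. The original answer appears to target a different database, so SQLite and the sample rows are assumptions here; the correlated INTERSECT variant is not exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 INTEGER, col2 TEXT);
    INSERT INTO table1 VALUES (1, 'a');
    -- table2 holds one row already in table1 and one new row, twice.
    INSERT INTO table2 VALUES (1, 'a'), (2, 'b'), (2, 'b');
""")

# Without DISTINCT, the anti-join still yields the new row twice.
candidates = conn.execute("""
    SELECT t2.col1, t2.col2
    FROM table2 t2
    WHERE NOT EXISTS (SELECT 1
                      FROM table1 t1
                      WHERE t2.col1 = t1.col1
                        AND t2.col2 = t1.col2)
""").fetchall()

# With DISTINCT, only one copy of the new row is inserted.
conn.execute("""
    INSERT INTO table1 (col1, col2)
    SELECT DISTINCT t2.col1, t2.col2
    FROM table2 t2
    WHERE NOT EXISTS (SELECT 1
                      FROM table1 t1
                      WHERE t2.col1 = t1.col1
                        AND t2.col2 = t1.col2)
""")
after_insert = conn.execute(
    "SELECT col1, col2 FROM table1 ORDER BY col1").fetchall()

# The EXCEPT form finds nothing left to insert on a second pass.
conn.execute("""
    INSERT INTO table1 (col1, col2)
    SELECT t2.col1, t2.col2 FROM table2 t2
    EXCEPT
    SELECT t1.col1, t1.col2 FROM table1 t1
""")
final_count = conn.execute("SELECT count(*) FROM table1").fetchone()[0]

print(candidates)    # [(2, 'b'), (2, 'b')]
print(after_insert)  # [(1, 'a'), (2, 'b')]
print(final_count)   # 2
```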
How to avoid duplicate records in left join
You need to aggregate t2 before joining:
SELECT t1.*, t2.City
FROM t1 LEFT JOIN
(SELECT t2.ID, ANY_VALUE(t2.City) AS City
FROM t2
GROUP BY t2.ID
) t2
ON t1.ID = t2.ID;
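A minimal sketch of the same aggregate-then-join idea, run on SQLite through Python's sqlite3 module. Because this sketch does not rely on ANY_VALUE() being available in SQLite, min() is swapped in as a stand-in; the Name column, the alias c, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE t2 (ID INTEGER, City TEXT);
    INSERT INTO t1 VALUES (1, 'Ann'), (2, 'Bob');
    -- ID 1 has two cities; ID 2 has none.
    INSERT INTO t2 VALUES (1, 'Oslo'), (1, 'Rome');
""")

rows = conn.execute("""
    SELECT t1.ID, t1.Name, c.City
    FROM t1
    LEFT JOIN (
        SELECT ID, min(City) AS City   -- min() stands in for ANY_VALUE()
        FROM t2
        GROUP BY ID
    ) c ON t1.ID = c.ID
    ORDER BY t1.ID
""").fetchall()

print(rows)  # [(1, 'Ann', 'Oslo'), (2, 'Bob', None)]
```

Each t1 row appears exactly once, and rows without a match in t2 keep a NULL city, as expected from a LEFT JOIN.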