Getting Extra Rows - After Joining the 3 Tables Using Left Join

How can a LEFT OUTER JOIN return more records than exist in the left table?

The LEFT OUTER JOIN will return all records from the LEFT table joined with the RIGHT table where possible.

If there are multiple matches, though, it returns every matching combination. One row in LEFT that matches two rows in RIGHT is therefore returned as two rows, just as with an INNER JOIN.
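A minimal sketch of this behaviour, using Python's built-in sqlite3 module. The tables left_t and right_t are hypothetical and not taken from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE right_t (id INTEGER PRIMARY KEY, left_id INTEGER, note TEXT);
    INSERT INTO left_t  VALUES (1, 'a'), (2, 'b');
    -- left row 1 has two matches, left row 2 has none
    INSERT INTO right_t VALUES (10, 1, 'x'), (11, 1, 'y');
""")

left_count = conn.execute("SELECT COUNT(*) FROM left_t").fetchone()[0]
joined = conn.execute("""
    SELECT l.id, l.name, r.note
    FROM left_t l
    LEFT JOIN right_t r ON r.left_id = l.id
    ORDER BY l.id, r.id
""").fetchall()

print(left_count)  # 2
print(joined)      # [(1, 'a', 'x'), (1, 'a', 'y'), (2, 'b', None)]
```

The left table has two rows, but the join returns three: the matched row appears once per match, and the unmatched row appears once with NULL.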

EDIT:
In response to your edit, I've just had a further look at your query and it looks like you are only returning data from the LEFT table. Therefore, if you only want data from the LEFT table, and you only want one row returned for each row in the LEFT table, then you have no need to perform a JOIN at all and can just do a SELECT directly from the LEFT table.

MySQL LEFT JOIN of three tables returns too many rows

You are getting the correct result. The 12 records are exactly the set of tuples your query asks for. I am not sure why you are joining these 3 tables together, because the 2 related tables inherently do not hold the same type of data. What I would suggest is that you select person & movies, and then UNION that with person & artists. Because a UNION needs the columns on both sides to line up, I would suggest adding a type column to differentiate artists from movies, and selecting the display name AS string_value in both halves.
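The question's schema is not shown here, so the following is only a sketch with an assumed layout (person, movies and artists tables, each child table carrying a person_id). It uses Python's sqlite3 module and UNION ALL; a plain UNION would additionally remove duplicate rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schema: one person with 3 movies and 4 artists
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movies  (id INTEGER PRIMARY KEY, person_id INTEGER, title TEXT);
    CREATE TABLE artists (id INTEGER PRIMARY KEY, person_id INTEGER, name TEXT);
    INSERT INTO person  VALUES (1, 'p1');
    INSERT INTO movies  VALUES (1, 1, 'm1'), (2, 1, 'm2'), (3, 1, 'm3');
    INSERT INTO artists VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 1, 'a3'), (4, 1, 'a4');
""")

# Joining both child tables pairs every movie with every artist: 3 x 4 rows.
joined = conn.execute("""
    SELECT p.name, m.title, a.name
    FROM person p
    LEFT JOIN movies  m ON m.person_id = p.id
    LEFT JOIN artists a ON a.person_id = p.id
""").fetchall()

# One query per child table, combined with UNION ALL: 3 + 4 rows.
unioned = conn.execute("""
    SELECT p.name, 'movie' AS type, m.title AS string_value
    FROM person p JOIN movies m ON m.person_id = p.id
    UNION ALL
    SELECT p.name, 'artist' AS type, a.name AS string_value
    FROM person p JOIN artists a ON a.person_id = p.id
""").fetchall()

print(len(joined))   # 12
print(len(unioned))  # 7
```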

Left Join without duplicate rows from left table

Try an OUTER APPLY

SELECT 
C.Content_ID,
C.Content_Title,
C.Content_DatePublished,
M.Media_Id
FROM
tbl_Contents C
OUTER APPLY
(
SELECT TOP 1 *
FROM tbl_Media M
WHERE M.Content_Id = C.Content_Id
) m
ORDER BY
C.Content_DatePublished ASC

Alternatively, you could GROUP BY the results

SELECT 
C.Content_ID,
C.Content_Title,
C.Content_DatePublished,
M.Media_Id
FROM
tbl_Contents C
LEFT OUTER JOIN tbl_Media M ON M.Content_Id = C.Content_Id
GROUP BY
C.Content_ID,
C.Content_Title,
C.Content_DatePublished,
M.Media_Id
ORDER BY
C.Content_DatePublished ASC

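The OUTER APPLY selects a single row (or none) that matches each row from the left table. Because the TOP 1 subquery has no ORDER BY, which matching row it picks is not defined; add an ORDER BY inside the subquery if that matters.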

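The GROUP BY performs the entire join, but then collapses the final result rows on the provided columns. Note that because M.Media_Id is one of the grouping columns, a content row with several different media rows still appears once per distinct Media_Id. To get exactly one row per content row, aggregate that column instead (for example MIN(M.Media_Id)) and leave it out of the GROUP BY.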
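SQLite has no OUTER APPLY, so the sketch below only illustrates the GROUP BY route, with MIN(Media_Id) swapped in as the aggregate. It reuses the table names from the queries above with a reduced, assumed set of columns and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_Contents (Content_Id INTEGER PRIMARY KEY, Content_Title TEXT);
    CREATE TABLE tbl_Media    (Media_Id INTEGER PRIMARY KEY, Content_Id INTEGER);
    INSERT INTO tbl_Contents VALUES (1, 'first'), (2, 'second');
    -- content 1 has two media rows, content 2 has none
    INSERT INTO tbl_Media VALUES (100, 1), (101, 1);
""")

# Plain LEFT JOIN: content 1 is repeated once per media row.
plain_join = conn.execute("""
    SELECT C.Content_Id, M.Media_Id
    FROM tbl_Contents C
    LEFT JOIN tbl_Media M ON M.Content_Id = C.Content_Id
    ORDER BY C.Content_Id, M.Media_Id
""").fetchall()

# Grouping on the content columns only, with Media_Id aggregated.
one_per_content = conn.execute("""
    SELECT C.Content_Id, MIN(M.Media_Id) AS Media_Id
    FROM tbl_Contents C
    LEFT JOIN tbl_Media M ON M.Content_Id = C.Content_Id
    GROUP BY C.Content_Id
    ORDER BY C.Content_Id
""").fetchall()

print(plain_join)       # [(1, 100), (1, 101), (2, None)]
print(one_per_content)  # [(1, 100), (2, None)]
```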

JOIN three tables resulting in duplicate records

When aggregating from more than one table, aggregate before joining:

SELECT 
l.*,
i._items,
lt.taxonomy
FROM lists l
JOIN
(
select list_id, json_agg(list_taxonomies.* order by type) AS taxonomy
from list_taxonomies
group by list_id
) lt ON lt.list_id = l.id
JOIN
(
select list_id, json_agg(items.* order by id) AS _items
from items
group by list_id
) i ON i.list_id = l.id
WHERE l.id = 3;
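To see why the order matters, here is a small sketch using Python's sqlite3 module. It keeps the table names from the query above, but the columns and rows are assumptions, and COUNT stands in for json_agg (which is PostgreSQL-specific):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lists           (id INTEGER PRIMARY KEY);
    CREATE TABLE list_taxonomies (id INTEGER PRIMARY KEY, list_id INTEGER);
    CREATE TABLE items           (id INTEGER PRIMARY KEY, list_id INTEGER);
    INSERT INTO lists VALUES (3);
    INSERT INTO list_taxonomies VALUES (1, 3), (2, 3);   -- 2 taxonomies
    INSERT INTO items VALUES (1, 3), (2, 3), (3, 3);     -- 3 items
""")

# Join first, aggregate afterwards: each taxonomy is paired with each item,
# so both counts see 2 x 3 = 6 rows.
naive = conn.execute("""
    SELECT COUNT(lt.id), COUNT(i.id)
    FROM lists l
    JOIN list_taxonomies lt ON lt.list_id = l.id
    JOIN items i ON i.list_id = l.id
    WHERE l.id = 3
    GROUP BY l.id
""").fetchone()

# Aggregate each child table first, then join one row per list.
pre_aggregated = conn.execute("""
    SELECT lt.n, i.n
    FROM lists l
    JOIN (SELECT list_id, COUNT(*) AS n FROM list_taxonomies GROUP BY list_id) lt
      ON lt.list_id = l.id
    JOIN (SELECT list_id, COUNT(*) AS n FROM items GROUP BY list_id) i
      ON i.list_id = l.id
    WHERE l.id = 3
""").fetchone()

print(naive)           # (6, 6)
print(pre_aggregated)  # (2, 3)
```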

How to join 3 tables without losing rows?

Based on your posted sample data for the three tables, you can use a couple of LEFT JOINs to ensure that all of the iditem values are preserved. Consequently, when joined tables don't sync up with tables to their left, you'll see some NULL values.

SELECT `item`.`iditem`, 
COALESCE(MAX(`bid`.`amount`), 0) AS `amount`,
`item`.`description`,
`item`.`min_price`,
`seller`.`name`
FROM `item`
LEFT JOIN `seller` ON `seller`.`idseller` = `item`.`idseller`
LEFT JOIN `bid` ON `bid`.`iditem` = `item`.`iditem`
WHERE `item`.`idcategory` IS NULL /* because you didn't give this bit of data, nor the expiry */
GROUP BY `item`.`iditem`;

Here's my sqlfiddle demo to prove the query's success on your sample data.

When performing JOINs you should include an ON clause that determines which columns are acting as the glue.

The WHERE clause is now at your disposal to configure to your project requirements. Be sure to clarify which table certain columns come from so that you don't get errors based on ambiguity.

Backtick-wrapping table names and columns may be overkill (and some devs don't like the bloat), but I find it sensible because it eliminates the possibility of accidentally using a MySQL reserved word.

Why does a join of 3 tables produce duplicate rows?

I created sample data with 3 tables, Vehicletype, VehicleOwner and VehicleInformation, in which Vehicletype.Id is the PK that TypeId in the other two tables refers to.

/* Create the tables */

CREATE TABLE Vehicletype(Id integer PRIMARY KEY, Name text);
CREATE TABLE VehicleOwner(OwnerId integer, InfoID integer, TypeId integer, Name text);
CREATE TABLE VehicleInformation(InfoId integer, OwnerId integer, TypeId integer, INfo text);

/* Create few records in Vehicletype table */

INSERT INTO Vehicletype VALUES(1,'TYPE1');
INSERT INTO Vehicletype VALUES(2,'TYPE2');
INSERT INTO Vehicletype VALUES(3,'TYPE3');

/* Create few records in VehicleOwner table */

INSERT INTO VehicleOwner VALUES(1,1,1,'NAME1');
INSERT INTO VehicleOwner VALUES(2,2,2,'NAME2');
INSERT INTO VehicleOwner VALUES(3,3,3,'NAME3');
INSERT INTO VehicleOwner VALUES(4,4,1,'NAME4');
INSERT INTO VehicleOwner VALUES(5,5,2,'NAME5');
INSERT INTO VehicleOwner VALUES(6,6,3,'NAME6');
INSERT INTO VehicleOwner VALUES(7,7,1,'NAME7');

/* Create few records in VehicleInformation table */

INSERT INTO VehicleInformation VALUES(1,1,1,'INFO1');
INSERT INTO VehicleInformation VALUES(2,2,2,'INFO2');
INSERT INTO VehicleInformation VALUES(3,3,3,'INFO3');
INSERT INTO VehicleInformation VALUES(4,4,1,'INFO4');
INSERT INTO VehicleInformation VALUES(5,5,2,'INFO5');
INSERT INTO VehicleInformation VALUES(6,6,3,'INFO6');
INSERT INTO VehicleInformation VALUES(7,7,1,'INFO7');

COMMIT;

/* Display all the records from the table */

SELECT * FROM Vehicletype;
SELECT * FROM VehicleOwner;
SELECT * FROM VehicleInformation;

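This join will give you the unique result from your data. It matches VehicleOwner to VehicleInformation on all of their shared key columns (OwnerId, InfoId and TypeId), not on TypeId alone, so each owner row pairs with exactly one information row.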

select *
from Vehicletype vt
join VehicleOwner vo on vo.typeid = vt.id
join VehicleInformation vi
  on vi.typeid = vo.typeid
 and vi.ownerid = vo.ownerid
 and vi.infoid = vo.infoid;
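As a rough check, the sketch below loads the same sample rows into an in-memory SQLite database through Python's sqlite3 module and compares a join on TypeId alone with the join on all shared keys. The row-generating expressions are a shortcut I am assuming to be equivalent to the INSERT statements above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vehicletype(Id integer PRIMARY KEY, Name text);
    CREATE TABLE VehicleOwner(OwnerId integer, InfoID integer, TypeId integer, Name text);
    CREATE TABLE VehicleInformation(InfoId integer, OwnerId integer, TypeId integer, INfo text);
""")
conn.executemany("INSERT INTO Vehicletype VALUES (?, ?)",
                 [(n, f"TYPE{n}") for n in (1, 2, 3)])
# Owner/info n has type ((n - 1) % 3) + 1, i.e. 1,2,3,1,2,3,1 for n = 1..7.
rows = [(n, n, (n - 1) % 3 + 1) for n in range(1, 8)]
conn.executemany("INSERT INTO VehicleOwner VALUES (?, ?, ?, ?)",
                 [(o, i, t, f"NAME{o}") for o, i, t in rows])
conn.executemany("INSERT INTO VehicleInformation VALUES (?, ?, ?, ?)",
                 [(i, o, t, f"INFO{i}") for o, i, t in rows])

# Joining both tables on TypeId only pairs every owner of a type
# with every information row of that type.
type_only = conn.execute("""
    SELECT COUNT(*)
    FROM Vehicletype vt
    JOIN VehicleOwner vo ON vo.TypeId = vt.Id
    JOIN VehicleInformation vi ON vi.TypeId = vt.Id
""").fetchone()[0]

# Joining on all shared keys pairs each owner with its own information row.
all_keys = conn.execute("""
    SELECT COUNT(*)
    FROM Vehicletype vt
    JOIN VehicleOwner vo ON vo.TypeId = vt.Id
    JOIN VehicleInformation vi
      ON vi.TypeId = vo.TypeId AND vi.OwnerId = vo.OwnerId AND vi.InfoId = vo.InfoID
""").fetchone()[0]

print(type_only)  # 17  (3*3 + 2*2 + 2*2)
print(all_keys)   # 7
```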

Left join for 3 tables with Where not working as expected

Applying a condition on the outer-joined table in the WHERE clause effectively turns the outer join into an inner join. Every row that the outer join retains without a match contains NULL in that table's columns, and the condition c.Indicator = 'Y' in the WHERE clause removes those rows again.

To fix this, move c.Indicator = 'Y' into the join condition:

SELECT a.ID, b.Number, a.Version, c.Name, c.Indicator
FROM Version a
LEFT JOIN Cell b ON a.ID = b.ID
LEFT JOIN Names c ON a.ID = c.ID AND c.Indicator = 'Y'
WHERE a.Version LIKE '1%'
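The difference can be reproduced with a small sketch in Python's sqlite3 module. It keeps the Version and Names tables from the query (Cell is left out for brevity), and the rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Version (ID INTEGER PRIMARY KEY, Version TEXT);
    CREATE TABLE Names   (ID INTEGER, Name TEXT, Indicator TEXT);
    INSERT INTO Version VALUES (1, '1.0'), (2, '1.1'), (3, '2.0');
    INSERT INTO Names   VALUES (1, 'n1', 'Y'), (2, 'n2', 'N');
""")

# Condition in WHERE: rows without a 'Y' match are filtered out afterwards.
in_where = conn.execute("""
    SELECT a.ID, c.Name
    FROM Version a
    LEFT JOIN Names c ON a.ID = c.ID
    WHERE a.Version LIKE '1%' AND c.Indicator = 'Y'
    ORDER BY a.ID
""").fetchall()

# Condition in ON: every matching Version row survives, with NULL when
# there is no 'Y' row in Names.
in_on = conn.execute("""
    SELECT a.ID, c.Name
    FROM Version a
    LEFT JOIN Names c ON a.ID = c.ID AND c.Indicator = 'Y'
    WHERE a.Version LIKE '1%'
    ORDER BY a.ID
""").fetchall()

print(in_where)  # [(1, 'n1')]
print(in_on)     # [(1, 'n1'), (2, None)]
```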

Duplicate rows when joining three tables

What is happening is that joining Report (1 row) to ClothingObservation (10 rows) produces 10 rows (1 x 10); you then join to HygieneObservation (10 rows), which gives you 100. This happens because after the initial join you have 10 rows with the same ReportID, so the next join takes each of these 10 rows and joins it to all 10 rows in HygieneObservation.

The solution for "20 rows with NULL values":

SELECT 
Report.ReportId,
Report.Period,
Report.Reporter,
Report.DepartmentId,
ClothingObservation.ClothingObservationId,
NULL AS HygieneObservationId
FROM Report
LEFT JOIN ClothingObservation ON
(ClothingObservation.ReportId = Report.ReportId)
UNION ALL
SELECT
Report.ReportId,
Report.Period,
Report.Reporter,
Report.DepartmentId,
NULL AS ClothingObservationId,
HygieneObservation.HygieneObservationId
FROM Report
LEFT JOIN HygieneObservation ON
(HygieneObservation.ReportId = Report.ReportId)

How it works:

You essentially write two separate queries: one that joins Report to ClothingObservation and another that joins Report to HygieneObservation. You then combine the two queries with UNION ALL.
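The row counts can be checked with a sketch like the following, written against Python's sqlite3 module. It assumes one report with 10 observations of each kind, as described at the start of this answer, and trims the tables down to their key columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Report (ReportId INTEGER PRIMARY KEY);
    CREATE TABLE ClothingObservation (ClothingObservationId INTEGER, ReportId INTEGER);
    CREATE TABLE HygieneObservation  (HygieneObservationId INTEGER, ReportId INTEGER);
    INSERT INTO Report VALUES (1);
""")
ids = [(n,) for n in range(1, 11)]  # 10 observations of each kind for report 1
conn.executemany("INSERT INTO ClothingObservation VALUES (?, 1)", ids)
conn.executemany("INSERT INTO HygieneObservation VALUES (?, 1)", ids)

# Joining both observation tables to Report multiplies the rows: 1 x 10 x 10.
both_joined = conn.execute("""
    SELECT COUNT(*)
    FROM Report r
    LEFT JOIN ClothingObservation c ON c.ReportId = r.ReportId
    LEFT JOIN HygieneObservation  h ON h.ReportId = r.ReportId
""").fetchone()[0]

# Two independent joins stacked with UNION ALL: 10 + 10 rows.
unioned = conn.execute("""
    SELECT r.ReportId, c.ClothingObservationId, NULL AS HygieneObservationId
    FROM Report r
    LEFT JOIN ClothingObservation c ON c.ReportId = r.ReportId
    UNION ALL
    SELECT r.ReportId, NULL AS ClothingObservationId, h.HygieneObservationId
    FROM Report r
    LEFT JOIN HygieneObservation h ON h.ReportId = r.ReportId
""").fetchall()

print(both_joined)   # 100
print(len(unioned))  # 20
```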

The solution for "get 10 rows"

This is complex as it involves what I call "vertical merging" or "Merge Join". Below is the query (Update: I have tested it).

SELECT 
Report.ReportId,
Report.Period,
Report.Reporter,
Report.DepartmentId,

MergedObservations.ClothingObservationId,
MergedObservations.HygieneObservationId
FROM Report
LEFT JOIN
( SELECT COALESCE( ClothingObservation.ReportID, HygieneObservation.ReportID ) AS ReportID,
HygieneObservationID, ClothingObservationID -- Add appropriate columns
FROM
( SELECT ROW_NUMBER() OVER( PARTITION BY ReportID ORDER BY ClothingObservationID ) AS ResultID, ReportID, ClothingObservationID
FROM ClothingObservation ) AS ClothingObservation
FULL OUTER JOIN
( SELECT ROW_NUMBER() OVER( PARTITION BY ReportID ORDER BY HygieneObservationID ) AS ResultID, ReportID, HygieneObservationID
FROM HygieneObservation ) AS HygieneObservation
ON ClothingObservation.ReportID = HygieneObservation.ReportID
AND ClothingObservation.ResultID = HygieneObservation.ResultID
) AS MergedObservations
ON Report.ReportID = MergedObservations.ReportID

How it works:

Because ClothingObservation and HygieneObservation are not directly related to each other and can have a different number of rows per ReportID, I use the ROW_NUMBER() function to generate a join key. I then do a "Merge Join" using ReportID and the output of the ROW_NUMBER() function.
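As an analogy only (this is plain Python, not the SQL above), pairing rows by position within a single ReportID behaves like itertools.zip_longest. The ID ranges below are the ones used in this answer's sample data for ReportId 1:

```python
from itertools import zip_longest

clothing_ids = list(range(1, 11))  # ClothingObservationID 1..10
hygiene_ids = list(range(3, 14))   # HygieneObservationID 3..13

# Sorting plays the role of ORDER BY inside ROW_NUMBER(); zip_longest pairs
# the n-th row of each side and pads the shorter side with None, much as the
# FULL OUTER JOIN on ResultID pads with NULL.
merged = list(zip_longest(sorted(clothing_ids), sorted(hygiene_ids)))

print(len(merged))  # 11
print(merged[0])    # (1, 3)
print(merged[-1])   # (None, 13)
```

The merged result has as many rows as the longer of the two lists, not the product of their lengths.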

Sample Data

I have converted your sample data into usable table data to test the above queries.

CREATE TABLE Report( ReportId INT, Period DATETIME, Reporter VARCHAR( 20 ), DepartmentId INT )
CREATE TABLE ClothingObservation( ClothingObservationID INT, ReportId INT )
CREATE TABLE HygieneObservation( HygieneObservationID INT, ReportId INT )

INSERT INTO Report
VALUES( 1, '2016-05-01', 'username', 1 )

INSERT INTO ClothingObservation
VALUES
( 1, 1 ), ( 2, 1 ), ( 3, 1 ), ( 4, 1 ), ( 5, 1 ), ( 6, 1 ), ( 7, 1 ), ( 8, 1 ), ( 9, 1 ), ( 10, 1 )

INSERT INTO HygieneObservation
VALUES
( 3, 1 ), ( 4, 1 ), ( 5, 1 ), ( 6, 1 ), ( 7, 1 ), ( 8, 1 ), ( 9, 1 ), ( 10, 1 ), ( 11, 1 ), ( 12, 1 ), ( 13, 1 )

