Two SQL Left Joins Produce Incorrect Result

Left join then inner join incorrect result using sql?

Joins are "completed" in the order in which their ON clauses appear. Joins aren't just between tables; they're frequently between result sets produced by earlier joins. So:

select * from #temp t
left join #table_link tl on tl.temp_id=t.id
inner join #temp2 t2 on t2.id=tl.temp2_id and t2.active_tag=1

This first performs a left join between t and tl. That produces a new result set (let's call it ttl) which contains every row from t, each paired either with matching rows from tl or with NULLs (since it's a left join). We then inner join that result (ttl) to t2. However, ttl may contain NULLs in the columns that originated from tl, and a NULL never satisfies t2.id=tl.temp2_id, so the inner join discards exactly the rows the left join was meant to preserve.

What we can instead write is:

select * from #temp t
left join #table_link tl
    inner join #temp2 t2
    on t2.id=tl.temp2_id and t2.active_tag=1
on tl.temp_id=t.id

Note that, by moving the ON clauses around, I've changed the order of the joins. We now first perform an inner join between tl and t2 (let's call it t2tl) and then perform a left join between t and t2tl. The left join acts last, so rows from t with no match survive, with NULLs in the tl and t2 columns of the final result.
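A minimal sketch of the difference, using Python's built-in sqlite3 module. The table names temp1, table_link and temp2 and the three sample rows are assumptions made up for illustration (stand-ins for #temp, #table_link and #temp2). Not every engine accepts the stacked ON syntax, so the "inner join first" version is written here as a derived table, which should give the same result:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table_link (temp_id INTEGER, temp2_id INTEGER);
    CREATE TABLE temp2 (id INTEGER PRIMARY KEY, active_tag INTEGER);

    -- Hypothetical data: row 1 links to an active temp2 row,
    -- row 2 links to an inactive one, row 3 has no link at all.
    INSERT INTO temp1 VALUES (1), (2), (3);
    INSERT INTO table_link VALUES (1, 10), (2, 20);
    INSERT INTO temp2 VALUES (10, 1), (20, 0);
""")

# Left join first, inner join second: the inner join drops the unmatched rows.
sequential = conn.execute("""
    SELECT t.id, t2.id
    FROM temp1 t
    LEFT JOIN table_link tl ON tl.temp_id = t.id
    INNER JOIN temp2 t2 ON t2.id = tl.temp2_id AND t2.active_tag = 1
    ORDER BY t.id
""").fetchall()

# Inner join first (as a derived table), left join last: every temp1 row survives.
nested = conn.execute("""
    SELECT t.id, x.temp2_id
    FROM temp1 t
    LEFT JOIN (
        SELECT tl.temp_id, tl.temp2_id
        FROM table_link tl
        INNER JOIN temp2 t2 ON t2.id = tl.temp2_id AND t2.active_tag = 1
    ) x ON x.temp_id = t.id
    ORDER BY t.id
""").fetchall()

print(sequential)  # [(1, 10)]
print(nested)      # [(1, 10), (2, None), (3, None)]
```

With this data, only the row with an active link comes back from the first query, while the second keeps all three rows and fills the gaps with NULL (None in Python).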


To see how joins and ON clauses pair up, think of each JOIN keyword as an opening ( and each ON clause as a closing ). You find the join that an ON clause belongs to the same way you match parentheses.

MySql: Multiple Left Join giving wrong output

You need to flatten the results of your query in order to obtain the correct count.

You said you have a one-to-many relationship from your files table to the other table(s).

If only SQL had a LOOKUP keyword instead of cramming everything into JOIN, it would be easy to tell when the relationship between table A and table B is one-to-one, and JOIN would automatically connote one-to-many. But I digress. Anyway, I should have already inferred that files is one-to-many against dm_data, and that files is one-to-many against kc_data too. LEFT JOIN is another hint that the relationship between the first table and the second is one-to-many, though not a definitive one; some coders just write everything with LEFT JOIN. There's nothing wrong with the LEFT JOIN in your query in itself, but joining more than one one-to-many table in a single query will fail: each table's rows are repeated once for every matching row of the other table.

from
files
left join
dm_data ON dm_data.id = files.id
left join
kc_data ON kc_data.id = files.id

So, knowing that files is one-to-many against dm_data and also one-to-many against kc_data, we can conclude that the problem lies in chaining those joins and grouping over them in one monolithic query.

As an example, suppose you have three tables, app (files), ios_app (dm_data) and android_app (kc_data), and this is the data for ios:

test=# select * from ios_app order by app_code, date_released;
 ios_app_id | app_code | date_released | price
------------+----------+---------------+--------
          1 | AB       | 2010-01-01    | 1.0000
          3 | AB       | 2010-01-03    | 3.0000
          4 | AB       | 2010-01-04    | 4.0000
          2 | TR       | 2010-01-02    | 2.0000
          5 | TR       | 2010-01-05    | 5.0000
(5 rows)

And this is the data for your android:

test=# select * from android_app order by app_code, date_released;
 android_app_id | app_code | date_released |  price
----------------+----------+---------------+---------
              1 | AB       | 2010-01-06    |  6.0000
              2 | AB       | 2010-01-07    |  7.0000
              7 | MK       | 2010-01-07    |  7.0000
              3 | TR       | 2010-01-08    |  8.0000
              4 | TR       | 2010-01-09    |  9.0000
              5 | TR       | 2010-01-10    | 10.0000
              6 | TR       | 2010-01-11    | 11.0000
(7 rows)

If you merely use this query:

select x.app_code, 
count(i.date_released) as ios_release_count,
count(a.date_released) as android_release_count
from app x
left join ios_app i on i.app_code = x.app_code
left join android_app a on a.app_code = x.app_code
group by x.app_code
order by x.app_code

The output will be wrong:

 app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
 AB       |                 6 |                     6
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 8 |                     8
(4 rows)

You can think of chained joins as a cartesian product per join key: if one app_code has 3 rows in the first table and 2 rows in the second, the join produces 3 * 2 = 6 rows for it.

Here's the visualization. There are 2 android AB rows repeated for every ios AB row, and there are 3 ios AB rows, so what is the count when you do COUNT(ios_app.date_released)? It becomes 6; COUNT(android_app.date_released) is also 6. Likewise, there are 4 android TR rows repeated for every ios TR row, and there are 2 TR rows in ios, so that gives a count of 8.

 app_code | ios_release_date | android_release_date
----------+------------------+----------------------
 AB       | 2010-01-01       | 2010-01-06
 AB       | 2010-01-01       | 2010-01-07
 AB       | 2010-01-03       | 2010-01-06
 AB       | 2010-01-03       | 2010-01-07
 AB       | 2010-01-04       | 2010-01-06
 AB       | 2010-01-04       | 2010-01-07
 MK       |                  | 2010-01-07
 PM       |                  |
 TR       | 2010-01-02       | 2010-01-08
 TR       | 2010-01-02       | 2010-01-09
 TR       | 2010-01-02       | 2010-01-10
 TR       | 2010-01-02       | 2010-01-11
 TR       | 2010-01-05       | 2010-01-08
 TR       | 2010-01-05       | 2010-01-09
 TR       | 2010-01-05       | 2010-01-10
 TR       | 2010-01-05       | 2010-01-11
(16 rows)
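If you want to reproduce the row multiplication yourself, here is a small sketch using Python's built-in sqlite3 module rather than the original database. The contents of the app table are an assumption (inferred from the app_code values in the output above), the price column is left out because the counts don't use it, and joined_rows is an extra column added only to show how many joined rows each app_code ends up with:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app (app_code TEXT PRIMARY KEY);
    CREATE TABLE ios_app (ios_app_id INTEGER, app_code TEXT, date_released TEXT);
    CREATE TABLE android_app (android_app_id INTEGER, app_code TEXT, date_released TEXT);

    -- app rows are inferred from the app_code values in the query output
    INSERT INTO app VALUES ('AB'), ('MK'), ('PM'), ('TR');
    INSERT INTO ios_app VALUES
        (1, 'AB', '2010-01-01'), (3, 'AB', '2010-01-03'), (4, 'AB', '2010-01-04'),
        (2, 'TR', '2010-01-02'), (5, 'TR', '2010-01-05');
    INSERT INTO android_app VALUES
        (1, 'AB', '2010-01-06'), (2, 'AB', '2010-01-07'), (7, 'MK', '2010-01-07'),
        (3, 'TR', '2010-01-08'), (4, 'TR', '2010-01-09'),
        (5, 'TR', '2010-01-10'), (6, 'TR', '2010-01-11');
""")

# Chained left joins: each ios row is paired with every android row
# that shares its app_code, so both counts see the multiplied rows.
rows_per_app = conn.execute("""
    SELECT x.app_code,
           COUNT(*)               AS joined_rows,
           COUNT(i.date_released) AS ios_release_count,
           COUNT(a.date_released) AS android_release_count
    FROM app x
    LEFT JOIN ios_app i ON i.app_code = x.app_code
    LEFT JOIN android_app a ON a.app_code = x.app_code
    GROUP BY x.app_code
    ORDER BY x.app_code
""").fetchall()

for row in rows_per_app:
    print(row)
# ('AB', 6, 6, 6)
# ('MK', 1, 0, 1)
# ('PM', 1, 0, 0)
# ('TR', 8, 8, 8)
```

The joined_rows values add up to 16, the same number of rows as in the visualization.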

So what you should do is flatten each result (aggregate it down to one row per app_code) before you join it to other tables and queries.

If your database supports CTEs, use them. They're neat and self-documenting:

with ios_app_release_count_list as
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
)
,android_release_count_list as
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
)
select
x.app_code,
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join ios_app_release_count_list i on i.app_code = x.app_code
left join android_release_count_list a on a.app_code = x.app_code
order by x.app_code;

If your database has no CTE support, like MySQL before version 8.0, use derived tables instead:

select x.app_code, 
coalesce(i.ios_release_count,0) as ios_release_count,
coalesce(a.android_release_count,0) as android_release_count
from app x
left join
(
select app_code, count(date_released) as ios_release_count
from ios_app
group by app_code
) i on i.app_code = x.app_code
left join
(
select app_code, count(date_released) as android_release_count
from android_app
group by app_code
) a on a.app_code = x.app_code
order by x.app_code

That query and the CTE-style query will show the correct output:

 app_code | ios_release_count | android_release_count
----------+-------------------+-----------------------
 AB       |                 3 |                     2
 MK       |                 0 |                     1
 PM       |                 0 |                     0
 TR       |                 2 |                     4
(4 rows)
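As a quick sanity check, the derived-table query can be run against the sample data with Python's built-in sqlite3 module. This is only a sketch: the app table contents are assumed from the app_code values in the output, and SQLite stands in for the original database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app (app_code TEXT PRIMARY KEY);
    CREATE TABLE ios_app (ios_app_id INTEGER, app_code TEXT, date_released TEXT);
    CREATE TABLE android_app (android_app_id INTEGER, app_code TEXT, date_released TEXT);

    -- app rows are assumed; the release rows mirror the sample data
    INSERT INTO app VALUES ('AB'), ('MK'), ('PM'), ('TR');
    INSERT INTO ios_app VALUES
        (1, 'AB', '2010-01-01'), (3, 'AB', '2010-01-03'), (4, 'AB', '2010-01-04'),
        (2, 'TR', '2010-01-02'), (5, 'TR', '2010-01-05');
    INSERT INTO android_app VALUES
        (1, 'AB', '2010-01-06'), (2, 'AB', '2010-01-07'), (7, 'MK', '2010-01-07'),
        (3, 'TR', '2010-01-08'), (4, 'TR', '2010-01-09'),
        (5, 'TR', '2010-01-10'), (6, 'TR', '2010-01-11');
""")

# Each derived table is reduced to one row per app_code before the join,
# so the joins can no longer multiply rows.
flattened = conn.execute("""
    SELECT x.app_code,
           COALESCE(i.ios_release_count, 0)     AS ios_release_count,
           COALESCE(a.android_release_count, 0) AS android_release_count
    FROM app x
    LEFT JOIN (
        SELECT app_code, COUNT(date_released) AS ios_release_count
        FROM ios_app
        GROUP BY app_code
    ) i ON i.app_code = x.app_code
    LEFT JOIN (
        SELECT app_code, COUNT(date_released) AS android_release_count
        FROM android_app
        GROUP BY app_code
    ) a ON a.app_code = x.app_code
    ORDER BY x.app_code
""").fetchall()

print(flattened)
# [('AB', 3, 2), ('MK', 0, 1), ('PM', 0, 0), ('TR', 2, 4)]
```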

Live test

Incorrect query: http://www.sqlfiddle.com/#!2/9774a/2

Correct query: http://www.sqlfiddle.com/#!2/9774a/1

Multiple joins and aggregate functions producing incorrect totals

Aggregate before joining:

SELECT f.category,
       -- SUM() is needed because the outer query groups several features per category
       SUM(COALESCE(ts.cnt, 0) + COALESCE(ls.cnt, 0)) as count,
       SUM(COALESCE(ts.at_risk, 0) + COALESCE(ls.at_risk, 0)) as at_risk,
       SUM(COALESCE(ts.danger, 0) + COALESCE(ls.danger, 0)) as danger
FROM features f LEFT JOIN
     layer l
     ON l.id = f.layer_id LEFT JOIN
     -- one row per fid, so this join cannot multiply rows
     (SELECT ts.fid, COUNT(*) as cnt,
             COUNT(*) FILTER (WHERE ts.status = 'AT_RISK') as at_risk,
             COUNT(*) FILTER (WHERE ts.status = 'DANGER') as danger
      FROM town_status ts
      GROUP BY ts.fid
     ) ts
     ON ts.fid = f.fid LEFT JOIN
     (SELECT ls.fid, COUNT(*) as cnt,
             COUNT(*) FILTER (WHERE ls.status = 'AT_RISK') as at_risk,
             COUNT(*) FILTER (WHERE ls.status = 'DANGER') as danger
      FROM landmark_status ls
      GROUP BY ls.fid
     ) ls
     ON ls.fid = f.fid
WHERE l.location = 'Bristol'
GROUP BY f.category
ORDER BY f.category;
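Here is a rough, self-contained sketch of the same idea using Python's built-in sqlite3 module. The rows are invented for illustration, the layer table and its location filter are left out to keep it short, and SUM(CASE ...) is used as a portable stand-in for COUNT(*) FILTER (...), since FILTER is not available in every engine or version:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE features (fid INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE town_status (fid INTEGER, status TEXT);
    CREATE TABLE landmark_status (fid INTEGER, status TEXT);

    -- Hypothetical sample rows
    INSERT INTO features VALUES (1, 'park'), (2, 'park'), (3, 'river');
    INSERT INTO town_status VALUES (1, 'AT_RISK'), (1, 'DANGER'), (2, 'OK');
    INSERT INTO landmark_status VALUES (1, 'DANGER'), (3, 'AT_RISK'), (3, 'AT_RISK');
""")

# Template for one pre-aggregated status table: one row per fid with its
# total and conditional counts. The table name is filled in from a fixed
# tuple below, never from user input.
per_fid = """
    SELECT fid,
           COUNT(*) AS cnt,
           SUM(CASE WHEN status = 'AT_RISK' THEN 1 ELSE 0 END) AS at_risk,
           SUM(CASE WHEN status = 'DANGER' THEN 1 ELSE 0 END) AS danger
    FROM {table}
    GROUP BY fid
"""

totals = conn.execute(f"""
    SELECT f.category,
           SUM(COALESCE(ts.cnt, 0) + COALESCE(ls.cnt, 0))         AS total,
           SUM(COALESCE(ts.at_risk, 0) + COALESCE(ls.at_risk, 0)) AS at_risk,
           SUM(COALESCE(ts.danger, 0) + COALESCE(ls.danger, 0))   AS danger
    FROM features f
    LEFT JOIN ({per_fid.format(table='town_status')}) ts ON ts.fid = f.fid
    LEFT JOIN ({per_fid.format(table='landmark_status')}) ls ON ls.fid = f.fid
    GROUP BY f.category
    ORDER BY f.category
""").fetchall()

print(totals)  # [('park', 4, 1, 2), ('river', 2, 2, 0)]
```

Feature 1 has two town rows and one landmark row. Because both status tables are aggregated per fid before the joins, those rows are counted as 2 + 1 = 3 rather than multiplied together.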

Query with multiple left joins - points column value is incorrect

I approached this by first doing a pre-query aggregate of your points per specific class, then left-joining to it. I am getting more rows in the result set than your sample expected, but I don't have MySQL to test/confirm directly. However, here is a SQLFiddle of your query. Because your query sums points while the join to the users table produces a Cartesian result, that is probably what duplicates the points. By pre-querying on the redeem codes alone, you just grab that value, then join to users.

SELECT
t.classroom_id,
title,
COALESCE ( r.classRewards, 0 ) AS totalRewards,
COALESCE ( r.classPoints, 0) AS totalPoints,
COALESCE ( r.uniqStudents, 0 ) as totalUniqRedeemStudents,
COALESCE ( COUNT(DISTINCT ocm.user_id), 0 ) AS totalStudents
FROM
organisation_classrooms t
LEFT JOIN ( select crc.classroom_id,
COUNT( DISTINCT crc.redeemed_code_id ) AS classRewards,
COUNT( DISTINCT crc.myuser_id ) as uniqStudents,
SUM( crc.points ) as classPoints
from classroom_redeemed_codes crc
JOIN organisation_classrooms t
ON crc.classroom_id = t.classroom_id
AND t.organisation_id = 37383
where crc.inactive = 0
AND ( crc.date_redeemed >= 1393286400
OR crc.date_redeemed = 0 )
group by crc.classroom_id ) r
ON t.classroom_id = r.classroom_id

LEFT OUTER JOIN organisation_classrooms_myusers ocm
ON t.classroom_id = ocm.classroom_id
WHERE
t.organisation_id = 37383
GROUP BY
title
ORDER BY
t.classroom_id ASC
LIMIT 10

