Why and When a Left Join with Condition in Where Clause Is Not Equivalent to the Same Left Join in On

The ON clause decides which rows of the joined table count as a match for each row of the left table. The WHERE clause filters rows after all the joining is done, including the rows a LEFT JOIN has padded with NULLs.

An example with Disney toons voting for president:

declare @candidates table (name varchar(50));
insert @candidates values
('Obama'),
('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values
('Mickey Mouse', 'Romney'),
('Donald Duck', 'Obama');

select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
and v.voter = 'Donald Duck'

This still returns Romney, even though Donald didn't vote for him: his row has no matching vote, so its voter and voted_for columns are NULL. If you move the condition from the ON to the WHERE clause:

select  *
from @candidates c
left join
@votes v
on c.name = v.voted_for
where v.voter = 'Donald Duck'

Romney will no longer be in the result set, because his only joined row has voter = 'Mickey Mouse' and the WHERE filter rejects it.
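If you want to try this without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table variables are replaced by ordinary tables with the same names and data (that substitution is my assumption, the queries are otherwise the ones above), and only two columns are selected to keep the output short.

```python
import sqlite3

# In-memory stand-ins for the @candidates and @votes table variables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidates (name TEXT);
    INSERT INTO candidates VALUES ('Obama'), ('Romney');
    CREATE TABLE votes (voter TEXT, voted_for TEXT);
    INSERT INTO votes VALUES ('Mickey Mouse', 'Romney'), ('Donald Duck', 'Obama');
""")

# Condition in ON: it only limits which votes can match a candidate.
on_rows = conn.execute("""
    SELECT c.name, v.voter
    FROM candidates c
    LEFT JOIN votes v
      ON c.name = v.voted_for AND v.voter = 'Donald Duck'
    ORDER BY c.name
""").fetchall()

# Condition in WHERE: it filters the already joined rows.
where_rows = conn.execute("""
    SELECT c.name, v.voter
    FROM candidates c
    LEFT JOIN votes v ON c.name = v.voted_for
    WHERE v.voter = 'Donald Duck'
    ORDER BY c.name
""").fetchall()

print(on_rows)     # [('Obama', 'Donald Duck'), ('Romney', None)]
print(where_rows)  # [('Obama', 'Donald Duck')]
```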

SQL INNER JOIN vs LEFT JOIN with a WHERE

Yes, they will return the same result. The left join without the WHERE clause reads as: show me all the records from the header table and the related items from the details table, or NULL for the details where there are no matches.

Adding a where clause relating the ids effectively transforms the left join to an inner join by eliminating the non-matching rows that would have shown up as having null for the detail part.

In some databases, like MS SQL Server, the left join would show up as an inner join in the query execution plan.
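A small sketch of that collapse, run through Python's sqlite3 module. The header and details tables, their columns and their rows are made up for illustration, since the original question's schema is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: header 3 has no detail rows.
conn.executescript("""
    CREATE TABLE header (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE details (id INTEGER PRIMARY KEY, header_id INTEGER, item TEXT);
    INSERT INTO header VALUES (1, 'first'), (2, 'second'), (3, 'no details');
    INSERT INTO details VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c');
""")

def rows(join_clause, where_clause=""):
    # Build one of the three query variants and return its rows.
    sql = f"""
        SELECT h.id, d.item
        FROM header h
        {join_clause} details d ON d.header_id = h.id
        {where_clause}
        ORDER BY h.id, d.id
    """
    return conn.execute(sql).fetchall()

left_only = rows("LEFT JOIN")
left_where = rows("LEFT JOIN", "WHERE d.header_id = h.id")
inner = rows("INNER JOIN")

print(left_only)   # [(1, 'a'), (1, 'b'), (2, 'c'), (3, None)]
print(left_where)  # [(1, 'a'), (1, 'b'), (2, 'c')]
print(inner)       # [(1, 'a'), (1, 'b'), (2, 'c')]
```

The NULL-padded row for header 3 fails the WHERE comparison (NULL = 3 is not true), which is why the filtered left join and the inner join agree.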

Although you stated that you don't want Venn diagrams I can't help referring you to this question and its answers even though they are filled with (in my opinion very helpful) Venn diagrams.

Why did my WHERE clause affect my LEFT JOIN?


I was taught that, during the SQL's query processing, the JOIN clause would run before the WHERE clause, ensuring that the latter would look through the joined result.

That's the correct description of the SQL semantics, so what you're seeing is most likely a bug.

The actual implementation of an RDBMS is more complex. At a high level, the SQL query is parsed into a logical query plan, which is a tree that closely follows the structure of the input SQL. The optimizer is then responsible for converting the logical plan to the actual steps (physical operators) that will run to produce the result.

The logical plan of your query will be something like:

read MAIN_TABLE        read PRODUCTS
          \              /
   join them on MAIN_TABLE.code = PRODUCTS.code
                  |
   apply filter MAIN_TABLE.code LIKE '%ABC%'

The optimizer's job is to figure out an efficient way to execute this. It can do transformations like predicate pushdown, where the filter (MAIN_TABLE.code LIKE '%ABC%') is pushed to the "read" stage, so that only relevant rows are read. Then the optimizer can decide on the physical operation it will use to read the input table (e.g. full-scan vs index-based reads).

(This is speculation on my part.) The optimizer could also notice that since you're joining on code, only the PRODUCTS that satisfy PRODUCTS.code LIKE '%ABC%' can be matched, so it could push down the predicate to the PRODUCTS scan operator as well. Depending on the collation on the input tables, if the optimizer is not very careful, the semantics of the LIKE '%ABC%' predicate could change, resulting in the behavior you're seeing.
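To make the pushdown idea concrete, here is a rough sketch that writes the "pushed down" form by hand as a subquery and checks that it returns the same rows as filtering after the join. It runs on SQLite via Python's sqlite3 module, the rows and the name column are invented, and I am assuming the original query was a LEFT JOIN. It only illustrates the logical equivalence for a filter on the left table, not what any particular optimizer actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two MAIN_TABLE codes contain 'ABC', one of them has a product.
conn.executescript("""
    CREATE TABLE MAIN_TABLE (code TEXT);
    CREATE TABLE PRODUCTS (code TEXT, name TEXT);
    INSERT INTO MAIN_TABLE VALUES ('XABCX'), ('DEF'), ('ABC1');
    INSERT INTO PRODUCTS VALUES ('XABCX', 'widget'), ('DEF', 'gadget');
""")

# Logical plan as written: join first, then filter.
filtered_after = conn.execute("""
    SELECT m.code, p.name
    FROM MAIN_TABLE m
    LEFT JOIN PRODUCTS p ON m.code = p.code
    WHERE m.code LIKE '%ABC%'
    ORDER BY m.code
""").fetchall()

# Hand-written pushdown: filter MAIN_TABLE before it reaches the join.
pushed_down = conn.execute("""
    SELECT m.code, p.name
    FROM (SELECT code FROM MAIN_TABLE WHERE code LIKE '%ABC%') m
    LEFT JOIN PRODUCTS p ON m.code = p.code
    ORDER BY m.code
""").fetchall()

print(filtered_after)  # [('ABC1', None), ('XABCX', 'widget')]
print(pushed_down)     # [('ABC1', None), ('XABCX', 'widget')]
```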

left join and where condition in joining condition

You should not use a column of the left-joined table (t2 here) in the WHERE condition, because that makes the query work as an INNER JOIN. Move the condition on that table into the related ON clause:

SELECT *
FROM table1 t1
left join table2 t2
ON t1.id = t2.fk_id AND t2.id_number = 12174
WHERE t1.code = 'CODE1' ;

A WHERE condition on the left-joined table behaves like part of an INNER JOIN clause, which is the reason you see this behavior.

Adding the condition to the ON clause means that the added condition also works as part of the outer join.

SQL JOIN - WHERE clause vs. ON clause

They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
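Here is a quick way to check all four combinations, sketched with Python's sqlite3 module. The LineID and Item columns and the sample rows are assumptions made for the demo, and the helper simply swaps the join type and the WHERE/AND keyword in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two orders, both of which have lines.
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER PRIMARY KEY);
    CREATE TABLE OrderLines (LineID INTEGER PRIMARY KEY, OrderID INTEGER, Item TEXT);
    INSERT INTO Orders VALUES (12345), (12346);
    INSERT INTO OrderLines VALUES (1, 12345, 'pen'), (2, 12345, 'ink'), (3, 12346, 'cap');
""")

def run(join_type, keyword):
    # keyword is 'WHERE' (filter after the join) or 'AND' (extra ON condition).
    sql = f"""
        SELECT Orders.ID, OrderLines.Item
        FROM Orders
        {join_type} JOIN OrderLines ON OrderLines.OrderID = Orders.ID
        {keyword} Orders.ID = 12345
        ORDER BY Orders.ID, OrderLines.LineID
    """
    return conn.execute(sql).fetchall()

left_where = run("LEFT", "WHERE")
left_on = run("LEFT", "AND")
inner_where = run("INNER", "WHERE")
inner_on = run("INNER", "AND")

print(left_where)  # [(12345, 'pen'), (12345, 'ink')]
print(left_on)     # [(12345, 'pen'), (12345, 'ink'), (12346, None)]
print(inner_where == inner_on)  # True
```

Note that order 12346 does have a line in the data, yet the ON version shows it with NULL, because the extra condition stops any line from matching it.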

Left Join With Where Clause

The where clause is filtering away rows where the left join doesn't succeed. Move it to the join:

SELECT  `settings`.*, `character_settings`.`value`
FROM `settings`
LEFT JOIN
`character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1'

Rows are eliminated from LEFT JOIN when using OR in WHERE clause

In this query:

SELECT *
FROM #LeftTable lt
LEFT OUTER JOIN #RightTable rt ON lt.PolicyRef = rt.PolicyRef
WHERE lt.RunID = rt.RunID OR rt.runid IS NULL

this part:

SELECT *
FROM #LeftTable lt
LEFT OUTER JOIN #RightTable rt ON lt.PolicyRef = rt.PolicyRef

will give you 3 rows:

100,'pol1','hi',80,'pol1','celec'

100,'pol2','hi2',90,'pol2','colorado'

100,'pol2','hi2',100,'pol2','colorado'

but the WHERE clause then requires lt.RunID = rt.RunID or a NULL rt.RunID. Every left row found a match, so no joined row has a NULL rt.RunID, and this is the only row that survives:

100,'pol2','hi2',100,'pol2','colorado'
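The same data can be replayed with Python's sqlite3 module. In this sketch the temp tables become ordinary tables, and the third column names (Note, Place) are guesses, since the original column names are not shown. The second query moves the RunID comparison into the ON clause, which is one possible fix if the intent was to keep pol1 with NULLs on the right side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Rows reconstructed from the three joined rows listed above.
conn.executescript("""
    CREATE TABLE LeftTable (RunID INTEGER, PolicyRef TEXT, Note TEXT);
    CREATE TABLE RightTable (RunID INTEGER, PolicyRef TEXT, Place TEXT);
    INSERT INTO LeftTable VALUES (100, 'pol1', 'hi'), (100, 'pol2', 'hi2');
    INSERT INTO RightTable VALUES
        (80, 'pol1', 'celec'), (90, 'pol2', 'colorado'), (100, 'pol2', 'colorado');
""")

# RunID test in WHERE: pol1 matched RunID 80, which is neither 100 nor NULL.
where_rows = conn.execute("""
    SELECT lt.PolicyRef, rt.RunID
    FROM LeftTable lt
    LEFT OUTER JOIN RightTable rt ON lt.PolicyRef = rt.PolicyRef
    WHERE lt.RunID = rt.RunID OR rt.RunID IS NULL
    ORDER BY lt.PolicyRef
""").fetchall()

# RunID test in ON: pol1 finds no match and is kept with NULLs.
on_rows = conn.execute("""
    SELECT lt.PolicyRef, rt.RunID
    FROM LeftTable lt
    LEFT OUTER JOIN RightTable rt
      ON lt.PolicyRef = rt.PolicyRef AND lt.RunID = rt.RunID
    ORDER BY lt.PolicyRef
""").fetchall()

print(where_rows)  # [('pol2', 100)]
print(on_rows)     # [('pol1', None), ('pol2', 100)]
```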

Why would LEFT JOIN on a field to then later filter it out in WHERE clause?

The query you have in the question is basically equivalent to the following query:

SELECT ID, Name, Phone 
FROM Table1
WHERE NOT EXISTS
(
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)

Meaning it selects all the records in Table1 that do not have a correlated record in Table2.

The execution plan for both queries will most likely be the same (Personally, I've never seen a case when they produce a different execution plan, but I don't rule that out), so both queries should be equally efficient, and it's up to you to decide whether the left join or the exists syntax is more readable to you.
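For comparison, here is a sketch of both forms side by side, run on SQLite through Python's sqlite3 module. The question's query is not reproduced on this page, so the LEFT JOIN ... IS NULL version below is my assumption of what it looked like, and the sample rows are invented. It checks that the results match, not that the execution plans do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: only ID 2 has a record in Table2.
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Name TEXT, Phone TEXT);
    CREATE TABLE Table2 (ID INTEGER);
    INSERT INTO Table1 VALUES
        (1, 'Ann', '555-0001'), (2, 'Bob', '555-0002'), (3, 'Cy', '555-0003');
    INSERT INTO Table2 VALUES (2);
""")

# Assumed shape of the question's query: join, then keep only the unmatched rows.
left_join_rows = conn.execute("""
    SELECT Table1.ID, Table1.Name, Table1.Phone
    FROM Table1
    LEFT JOIN Table2 ON Table1.ID = Table2.ID
    WHERE Table2.ID IS NULL
    ORDER BY Table1.ID
""").fetchall()

# The NOT EXISTS form from the answer.
not_exists_rows = conn.execute("""
    SELECT ID, Name, Phone
    FROM Table1
    WHERE NOT EXISTS (SELECT 1 FROM Table2 WHERE Table1.ID = Table2.ID)
    ORDER BY ID
""").fetchall()

print(left_join_rows)   # [(1, 'Ann', '555-0001'), (3, 'Cy', '555-0003')]
print(not_exists_rows)  # [(1, 'Ann', '555-0001'), (3, 'Cy', '555-0003')]
```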

sql left join criteria in join vs where clause


  1. Theoretically, yes, if a payment can have multiple alerts. However, since you are excluding payments with alerts, this should not be a problem. If you were including them, it could be. If it were a problem, you should use a "not in" subquery instead of a left outer join, since the join can produce duplicate records when the relationship is not 1:1.

  2. Having criteria in the where clause excludes the entire row if it doesn't match the criteria. Having it in the join clause means the joined record is not shown but the "parent" is.
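Both points can be seen in a small sketch using Python's sqlite3 module. The payments and alerts tables and their columns are hypothetical, since the original schema is not shown: payment 1 has two alerts, payment 2 has one, payment 3 has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; payment_id is NOT NULL so the NOT IN form is safe here.
conn.executescript("""
    CREATE TABLE payments (id INTEGER PRIMARY KEY);
    CREATE TABLE alerts (id INTEGER PRIMARY KEY, payment_id INTEGER NOT NULL);
    INSERT INTO payments VALUES (1), (2), (3);
    INSERT INTO alerts VALUES (1, 1), (2, 1), (3, 2);
""")

# Including alerts: payment 1 is repeated once per alert.
including = conn.execute("""
    SELECT p.id
    FROM payments p
    LEFT JOIN alerts a ON a.payment_id = p.id
    ORDER BY p.id
""").fetchall()

# Excluding payments with alerts via the join: no duplicates can survive.
excluding_join = conn.execute("""
    SELECT p.id
    FROM payments p
    LEFT JOIN alerts a ON a.payment_id = p.id
    WHERE a.payment_id IS NULL
""").fetchall()

# The "not in" alternative.
excluding_not_in = conn.execute("""
    SELECT id FROM payments
    WHERE id NOT IN (SELECT payment_id FROM alerts)
""").fetchall()

print(including)         # [(1,), (1,), (2,), (3,)]
print(excluding_join)    # [(3,)]
print(excluding_not_in)  # [(3,)]
```

One caveat with the "not in" form: if the subquery can return a NULL, NOT IN matches no rows at all, which is why payment_id is declared NOT NULL in this sketch.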


