Why and when a LEFT JOIN with condition in WHERE clause is not equivalent to the same LEFT JOIN in ON?
The ON clause is used when the join is looking for matching rows. The WHERE clause is used to filter rows after all the joining is done.
An example with Disney toons voting for president:
declare @candidates table (name varchar(50));
insert @candidates values
('Obama'),
('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values
('Mickey Mouse', 'Romney'),
('Donald Duck', 'Obama');
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
and v.voter = 'Donald Duck'
This still returns Romney even though Donald didn't vote for him. If you move the condition from the ON to the WHERE clause:
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
where v.voter = 'Donald Duck'
Romney will no longer be in the result set.
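The two queries above can be checked side by side. This is a minimal sketch that assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the SQL Server table variables, so ordinary tables named candidates and votes replace @candidates and @votes:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidates (name TEXT);
    INSERT INTO candidates VALUES ('Obama'), ('Romney');
    CREATE TABLE votes (voter TEXT, voted_for TEXT);
    INSERT INTO votes VALUES ('Mickey Mouse', 'Romney'), ('Donald Duck', 'Obama');
""")

# Condition in ON: every candidate is kept; a candidate without a
# matching vote gets NULL (None) for the vote columns.
on_rows = conn.execute("""
    SELECT c.name, v.voter
    FROM candidates c
    LEFT JOIN votes v
      ON c.name = v.voted_for AND v.voter = 'Donald Duck'
    ORDER BY c.name
""").fetchall()

# Condition in WHERE: the filter runs after the join, so the Romney row
# (joined to Mickey Mouse) fails the test and is removed.
where_rows = conn.execute("""
    SELECT c.name, v.voter
    FROM candidates c
    LEFT JOIN votes v
      ON c.name = v.voted_for
    WHERE v.voter = 'Donald Duck'
    ORDER BY c.name
""").fetchall()

print(on_rows)     # [('Obama', 'Donald Duck'), ('Romney', None)]
print(where_rows)  # [('Obama', 'Donald Duck')]
```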
SQL INNER JOIN vs LEFT JOIN with a WHERE
Yes, they will return the same result. The left join without the where clause would read as "show me all the records from the header table and the related items from the details table, or null for the details where there are no matches."
Adding a where clause relating the ids effectively transforms the left join into an inner join by eliminating the non-matching rows that would have shown up with null for the detail part.
In some databases, like MS SQL Server, the left join would show up as an inner join in the query execution plan.
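The equivalence described above can be sketched as follows. The table and column names (header, details, header_id, item) are hypothetical, since the original question's schema is not shown here, and SQLite via Python's sqlite3 module is assumed in place of the asker's database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: header 2 has no detail rows.
    CREATE TABLE header (id INTEGER, title TEXT);
    CREATE TABLE details (header_id INTEGER, item TEXT);
    INSERT INTO header VALUES (1, 'first'), (2, 'second');
    INSERT INTO details VALUES (1, 'a'), (1, 'b');
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Plain left join: header 2 survives with NULL for the detail part.
left_only = rows("""
    SELECT h.id, d.item FROM header h
    LEFT JOIN details d ON d.header_id = h.id
    ORDER BY h.id, d.item
""")

# Left join plus a WHERE relating the ids: the NULL row is filtered out.
left_where = rows("""
    SELECT h.id, d.item FROM header h
    LEFT JOIN details d ON d.header_id = h.id
    WHERE d.header_id = h.id
    ORDER BY h.id, d.item
""")

# Inner join: same rows as the filtered left join.
inner = rows("""
    SELECT h.id, d.item FROM header h
    INNER JOIN details d ON d.header_id = h.id
    ORDER BY h.id, d.item
""")

print(left_only)            # [(1, 'a'), (1, 'b'), (2, None)]
print(left_where == inner)  # True
```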
Although you stated that you don't want Venn diagrams I can't help referring you to this question and its answers even though they are filled with (in my opinion very helpful) Venn diagrams.
Why did my WHERE clause affect my LEFT JOIN?
I was taught that, during the SQL's query processing, the JOIN clause would run before the WHERE clause, ensuring that the latter would look through the joined result.
That's the correct description of the SQL semantics, so what you're seeing is most likely a bug.
The actual implementation of an RDBMS is more complex. At a high level, the SQL query is parsed into a logical query plan, which is a tree that closely follows the structure of the input SQL. The optimizer is then responsible for converting the logical plan to the actual steps (physical operators) that will run to produce the result.
The logical plan of your query will be something like:
read MAIN_TABLE      read PRODUCTS
           \            /
  join them on MAIN_TABLE.code = PRODUCTS.code
                  |
  apply filter MAIN_TABLE.code LIKE '%ABC%'
The optimizer's job is to figure out an efficient way to execute this. It can do transformations like predicate pushdown, where the filter (MAIN_TABLE.code LIKE '%ABC%') is pushed to the "read" stage, so that only relevant rows are read. Then the optimizer can decide on the physical operation it will use to read the input table (e.g. full-scan vs index-based reads).
(This is speculation on my part.) The optimizer could also notice that since you're joining on code, only the PRODUCTS that satisfy PRODUCTS.code LIKE '%ABC%' can be matched, so it could push down the predicate to the PRODUCTS scan operator as well. Depending on the collation on the input tables, if the optimizer is not very careful, the semantics of the LIKE '%ABC%' predicate could change, resulting in the behavior you're seeing.
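To illustrate what a correct pushdown of the MAIN_TABLE filter preserves, here is a rough sketch that writes the "pushed down" form by hand as a subquery and compares it with the original query. It assumes a LEFT JOIN, SQLite via Python's sqlite3 module, and made-up rows and a made-up name column on products; it does not show what any particular optimizer actually does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data; only the code columns come from the discussion above.
    CREATE TABLE main_table (code TEXT);
    CREATE TABLE products (code TEXT, name TEXT);
    INSERT INTO main_table VALUES ('XABCY'), ('ABC2'), ('DEF');
    INSERT INTO products VALUES ('XABCY', 'widget'), ('DEF', 'gadget');
""")

# Logical plan as written: join first, then filter on main_table.code.
filter_after_join = conn.execute("""
    SELECT m.code, p.name
    FROM main_table m
    LEFT JOIN products p ON m.code = p.code
    WHERE m.code LIKE '%ABC%'
    ORDER BY m.code
""").fetchall()

# Hand-written pushdown: filter main_table while "reading" it, then join.
filter_before_join = conn.execute("""
    SELECT m.code, p.name
    FROM (SELECT code FROM main_table WHERE code LIKE '%ABC%') AS m
    LEFT JOIN products p ON m.code = p.code
    ORDER BY m.code
""").fetchall()

print(filter_after_join)                        # [('ABC2', None), ('XABCY', 'widget')]
print(filter_after_join == filter_before_join)  # True
```

Because the filter only touches the preserved (left) table, moving it below the join does not change the result, which is why a differing result points at a bug rather than at the SQL semantics.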
left join and where condition in joining condition
You should not put a condition on a column of the left-joined table (table2 here) in the WHERE clause, because that makes the join behave like an INNER JOIN. Move that condition into the related ON clause:
select *
FROM table1 t1
left join table2 t2
ON t1.id = t2.fk_id AND t2.id_number = 12174
WHERE t1.code = 'CODE1' ;
A WHERE condition on the left-joined table works like part of an INNER JOIN clause, which is the reason you see this behavior.
Adding the condition to the ON clause means that it is applied as part of the outer join instead.
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.
With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
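A small sketch of the Orders example, assuming SQLite via Python's sqlite3 module. The second order (67890) and the Product column are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER);
    CREATE TABLE OrderLines (OrderID INTEGER, Product TEXT);  -- Product is hypothetical
    INSERT INTO Orders VALUES (12345), (67890);
    INSERT INTO OrderLines VALUES (12345, 'pen'), (67890, 'ink');
""")

def run(join_kind, placement):
    # placement is 'WHERE' or 'AND', i.e. filter after the join or inside ON.
    sql = f"""
        SELECT Orders.ID, OrderLines.Product
        FROM Orders
        {join_kind} JOIN OrderLines ON OrderLines.OrderID = Orders.ID
        {placement} Orders.ID = 12345
        ORDER BY Orders.ID
    """
    return conn.execute(sql).fetchall()

left_where = run("LEFT", "WHERE")   # only order 12345
left_on = run("LEFT", "AND")        # all orders, lines only for 12345
inner_where = run("INNER", "WHERE")
inner_on = run("INNER", "AND")

print(left_where)               # [(12345, 'pen')]
print(left_on)                  # [(12345, 'pen'), (67890, None)]
print(inner_where == inner_on)  # True
```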
Left Join With Where Clause
The WHERE clause is filtering away rows where the LEFT JOIN doesn't succeed. Move it to the join:
SELECT `settings`.*, `character_settings`.`value`
FROM `settings`
LEFT JOIN
`character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1'
Rows are eliminated from LEFT JOIN when using OR in WHERE clause
In this query:
SELECT *
FROM #LeftTable lt
LEFT OUTER JOIN #RightTable rt ON lt.PolicyRef = rt.PolicyRef
WHERE lt.RunID = rt.RunID OR rt.runid IS NULL
this part:
SELECT *
FROM #LeftTable lt
LEFT OUTER JOIN #RightTable rt ON lt.PolicyRef = rt.PolicyRef
will give you 3 rows:
100,'pol1','hi',80,'pol1','celec'
100,'pol2','hi2',90,'pol2','colorado'
100,'pol2','hi2',100,'pol2','colorado'
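but the WHERE clause then keeps only rows where both RunID values are equal or the right-hand RunID is NULL. Every left row found a match on PolicyRef, so rt.RunID is never NULL, and this is the only possible result: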
100,'pol2','hi2',100,'pol2','colorado'
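The rows above can be reproduced with a short sketch. It assumes SQLite via Python's sqlite3 module, plain tables instead of the #temp tables, and hypothetical names (Note, Region) for the third column of each table. The last query, which moves the RunID comparison into ON, is shown only as an illustration of how the unmatched left row could be kept:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LeftTable (RunID INTEGER, PolicyRef TEXT, Note TEXT);
    CREATE TABLE RightTable (RunID INTEGER, PolicyRef TEXT, Region TEXT);
    INSERT INTO LeftTable VALUES (100, 'pol1', 'hi'), (100, 'pol2', 'hi2');
    INSERT INTO RightTable VALUES
        (80, 'pol1', 'celec'), (90, 'pol2', 'colorado'), (100, 'pol2', 'colorado');
""")

base = """
    SELECT lt.RunID, lt.PolicyRef, rt.RunID
    FROM LeftTable lt
    LEFT OUTER JOIN RightTable rt ON lt.PolicyRef = rt.PolicyRef
"""
order = " ORDER BY lt.PolicyRef, rt.RunID"

# Join only: three rows, none with a NULL right side.
joined = conn.execute(base + order).fetchall()

# Original WHERE: pol1 matched RunID 80, so it is neither equal nor NULL.
filtered = conn.execute(
    base + " WHERE lt.RunID = rt.RunID OR rt.RunID IS NULL" + order
).fetchall()

# RunID comparison inside ON: pol1 finds no match and is kept with NULL.
in_on = conn.execute(
    base + " AND lt.RunID = rt.RunID" + order
).fetchall()

print(joined)    # [(100, 'pol1', 80), (100, 'pol2', 90), (100, 'pol2', 100)]
print(filtered)  # [(100, 'pol2', 100)]
print(in_on)     # [(100, 'pol1', None), (100, 'pol2', 100)]
```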
Why would LEFT JOIN on a field to then later filter it out in WHERE clause?
The query you have in the question is basically equivalent to the following query:
SELECT ID, Name, Phone
FROM Table1
WHERE NOT EXISTS
(
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)
This means it selects all the records in Table1 that do not have a correlated record in Table2.
The execution plan for both queries will most likely be the same (personally, I've never seen a case where they produce different execution plans, but I don't rule that out), so both queries should be equally efficient, and it's up to you to decide whether the left join or the exists syntax is more readable to you.
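As a sketch of that equivalence: the question's query is not reproduced here, so the LEFT JOIN ... IS NULL form below is an assumption about its shape. The data is invented, and SQLite via Python's sqlite3 module stands in for the original database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Name TEXT, Phone TEXT);
    CREATE TABLE Table2 (ID INTEGER);
    INSERT INTO Table1 VALUES (1, 'Ann', '111'), (2, 'Bob', '222');
    INSERT INTO Table2 VALUES (1);
""")

# Assumed shape of the question's query: left join, then keep only the
# rows where the join found nothing on the Table2 side.
left_join_rows = conn.execute("""
    SELECT Table1.ID, Name, Phone
    FROM Table1
    LEFT JOIN Table2 ON Table1.ID = Table2.ID
    WHERE Table2.ID IS NULL
    ORDER BY Table1.ID
""").fetchall()

# The NOT EXISTS form from the answer.
not_exists_rows = conn.execute("""
    SELECT ID, Name, Phone
    FROM Table1
    WHERE NOT EXISTS (SELECT 1 FROM Table2 WHERE Table1.ID = Table2.ID)
    ORDER BY ID
""").fetchall()

print(left_join_rows)                     # [(2, 'Bob', '222')]
print(left_join_rows == not_exists_rows)  # True
```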
sql left join criteria in join vs where clause
Theoretically, yes, if they can have multiple alerts. However, since you are excluding payments with alerts, this should not be a problem. If you were including them, it could be. If it were a problem, you should use a "not in" subquery instead of a left outer join, since the join can cause duplicate records if the relationship is not 1:1.
Having criteria in the where clause excludes the entire row if it doesn't match the criteria. Having it in the join clause means the joined record is not shown but the "parent" is.