T-SQL - Left Outer Joins - Filters in the where clause versus the on clause
If you filter the left outer joined table in the WHERE clause with a predicate that rejects NULLs, you are in effect creating an inner join.
See also this wiki page: WHERE conditions on a LEFT JOIN
SQL left join with filter in JOIN condition vs filter in WHERE clause
The big difference from the WHERE condition b.status is null or b.status in (10, 100) shows up when a row in b matches on b.id = a.id but has some other status, say b.status = 1.
In the first query you still get the row from table A, with the B columns as NULL, because the ON condition is not fully satisfied.
In the second query the join produces the row with both the a and b columns, and the WHERE clause then discards it.
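The case described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module only so that it runs without a SQL Server instance; the tables a and b, their columns, and the sample rows are assumptions made up for illustration, and the two queries are a guess at the shape of the ones being discussed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, status INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    -- id 1 has an accepted status, id 2 a rejected one, id 3 has no b row
    INSERT INTO b VALUES (1, 10), (2, 1);
""")

# "First query": the status filter is part of the ON condition.
filter_in_on = conn.execute("""
    SELECT a.id, b.status
    FROM a
    LEFT JOIN b ON b.id = a.id AND b.status IN (10, 100)
    ORDER BY a.id
""").fetchall()

# "Second query": the status filter is applied after the join.
filter_in_where = conn.execute("""
    SELECT a.id, b.status
    FROM a
    LEFT JOIN b ON b.id = a.id
    WHERE b.status IS NULL OR b.status IN (10, 100)
    ORDER BY a.id
""").fetchall()

print(filter_in_on)     # [(1, 10), (2, None), (3, None)]
print(filter_in_where)  # [(1, 10), (3, None)]
```

Row 2 is the interesting one: with the filter in ON it survives with a NULL status, with the filter in WHERE it is joined first and then thrown away.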
SQL Outer Join Filtering Conditions in ON versus WHERE
The first query will return cases where the parent has no children or where some of the children match the filter condition. Specifically, a parent whose children all fail the filter condition (for example, a parent with one child that doesn't match) will be omitted.
The second query will return a row for all parents. If there is no match on the filter condition, a NULL will be returned for all of c's columns. This is why you are getting more rows in query 2: parents with children that don't match the filter condition are output with NULL child values, whereas in the first query they are filtered out.
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345.
The second will return all orders, but only order 12345 will have any lines associated with it.
With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
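To see the two Orders queries side by side, here is a minimal sketch run through Python's sqlite3 module (standing in for SQL Server). The Item column, the second order 67890, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER);
    CREATE TABLE OrderLines (OrderID INTEGER, Item TEXT);
    INSERT INTO Orders VALUES (12345), (67890);
    INSERT INTO OrderLines VALUES (12345, 'pen'), (67890, 'ink');
""")

# Filter in WHERE: only order 12345 is returned.
in_where = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
    WHERE Orders.ID = 12345
""").fetchall()

# Filter in ON: every order is returned, but only 12345 keeps its lines.
in_on = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
                        AND Orders.ID = 12345
    ORDER BY Orders.ID
""").fetchall()

print(in_where)  # [(12345, 'pen')]
print(in_on)     # [(12345, 'pen'), (67890, None)]
```

Order 67890 does have a line in the sample data, yet it comes back with NULL because the ON condition rejects every pairing that is not order 12345.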
Difference between filtering queries in JOIN and WHERE?
The answer is NO difference, but:
I will always prefer to do the following.
- Always keep the join conditions in the ON clause
- Always put the filters in the WHERE clause
This makes the query more readable.
So I will use this query:
SELECT value
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table1.id = 1
However, when you are using OUTER JOINs there is a big difference between keeping the filter in the ON condition and keeping it in the WHERE condition.
Logical Query Processing
The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.
(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;
Flow diagram logical query processing
- (1) FROM: The FROM phase identifies the query’s source tables and processes table operators. Each table operator applies a series of sub phases. For example, the phases involved in a join are (1-J1) Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM phase generates virtual table VT1.
- (1-J1) Cartesian Product: This phase performs a Cartesian product (cross join) between the two tables involved in the table operator, generating VT1-J1.
- (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on the predicate that appears in the ON clause (<on_predicate>). Only rows for which the predicate evaluates to TRUE are inserted into VT1-J2.
- (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to CROSS JOIN or INNER JOIN), rows from the preserved table or tables for which a match was not found are added to the rows from VT1-J2 as outer rows, generating VT1-J3.
- (2) WHERE: This phase filters the rows from VT1 based on the predicate that appears in the WHERE clause (<where_predicate>). Only rows for which the predicate evaluates to TRUE are inserted into VT2.
- (3) GROUP BY: This phase arranges the rows from VT2 in groups based on the column list specified in the GROUP BY clause, generating VT3. Ultimately, there will be one result row per group.
- (4) HAVING: This phase filters the groups from VT3 based on the predicate that appears in the HAVING clause (<having_predicate>). Only groups for which the predicate evaluates to TRUE are inserted into VT4.
- (5) SELECT: This phase processes the elements in the SELECT clause, generating VT5.
- (5-1) Evaluate Expressions: This phase evaluates the expressions in the SELECT list, generating VT5-1.
- (5-2) DISTINCT: This phase removes duplicate rows from VT5-1, generating VT5-2.
- (5-3) TOP: This phase filters the specified top number or percentage of rows from VT5-2 based on the logical ordering defined by the ORDER BY clause, generating the table VT5-3.
- (6) ORDER BY: This phase sorts the rows from VT5-3 according to the column list specified in the ORDER BY clause, generating the cursor VC6.
This is taken from the book "T-SQL Querying (Developer Reference)".
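The join sub phases (1-J1) to (1-J3) and the WHERE phase (2) can be mimicked in a few lines of plain Python. This is only a toy model of the logical order, not how any engine physically executes a join. The function name left_join is hypothetical, and the sample rows are borrowed from the candidates and votes example further down this page.

```python
def left_join(left, right, on):
    """Toy model of the logical phases of a LEFT OUTER JOIN."""
    # (1-J1) Cartesian product: pair every left row with every right row.
    vt1_j1 = [(i, l, r) for i, l in enumerate(left) for r in right]
    # (1-J2) ON filter: keep only the pairs for which the predicate is true.
    vt1_j2 = [(i, l, r) for (i, l, r) in vt1_j1 if on(l, r)]
    # (1-J3) Add outer rows: left rows with no surviving pair come back
    # with None standing in for the NULL right-hand columns.
    matched = {i for (i, _, _) in vt1_j2}
    outer = [(i, l, None) for i, l in enumerate(left) if i not in matched]
    return [(l, r) for (_, l, r) in vt1_j2 + outer]


candidates = ["Obama", "Romney"]
votes = [("Mickey Mouse", "Romney"), ("Donald Duck", "Obama")]  # (voter, voted_for)

# Voter filter inside ON: it runs in phase 1-J2, before outer rows are added.
filter_in_on = left_join(
    candidates, votes,
    on=lambda c, v: c == v[1] and v[0] == "Donald Duck",
)

# Voter filter in WHERE: phase 2 runs after outer rows were added. A None
# (NULL) right side cannot satisfy the comparison, so such rows are dropped.
joined = left_join(candidates, votes, on=lambda c, v: c == v[1])
filter_in_where = [(c, v) for (c, v) in joined
                   if v is not None and v[0] == "Donald Duck"]

print(filter_in_on)     # [('Obama', ('Donald Duck', 'Obama')), ('Romney', None)]
print(filter_in_where)  # [('Obama', ('Donald Duck', 'Obama'))]
```

The only difference between the two results is whether the filter is evaluated before or after phase 1-J3, which is exactly why an outer join treats ON and WHERE differently.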
Sql LEFT OUTER JOIN with WHERE clause
Move the constraint to your on clause.
select *
from request r
left join requestStatus rs
on r.requestID = rs.requestID
--and status_id = 1
and status_id <> 2
What's happening is that the outer join is performed first. Any rows coming from the outer join that don't have matches will have NULLs in all the columns of the joined table. Then your where clause is applied, but since comparing a value with NULL (for example 1 <> NULL) evaluates to UNKNOWN rather than TRUE, those rows are filtered out, so it's not going to work like you want it to.
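The NULL behaviour can be checked directly. The sketch below again uses Python's sqlite3 module as a stand-in for SQL Server. The table and column names follow the query above, while the sample rows are assumptions. SQLite reports an UNKNOWN comparison result as NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing anything with NULL is neither true nor false.
null_checks = conn.execute(
    "SELECT 1 <> NULL, 1 = NULL, NULL IS NULL"
).fetchone()
print(null_checks)  # (None, None, 1)

conn.executescript("""
    CREATE TABLE request (requestID INTEGER);
    CREATE TABLE requestStatus (requestID INTEGER, status_id INTEGER);
    INSERT INTO request VALUES (1), (2), (3);
    -- request 3 has no status row at all
    INSERT INTO requestStatus VALUES (1, 1), (2, 2);
""")

# Constraint in WHERE: the unmatched request 3 is lost because
# NULL <> 2 is not TRUE.
in_where = conn.execute("""
    SELECT r.requestID, rs.status_id
    FROM request r
    LEFT JOIN requestStatus rs ON r.requestID = rs.requestID
    WHERE rs.status_id <> 2
    ORDER BY r.requestID
""").fetchall()

# Constraint in ON: every request is kept.
in_on = conn.execute("""
    SELECT r.requestID, rs.status_id
    FROM request r
    LEFT JOIN requestStatus rs ON r.requestID = rs.requestID
                              AND rs.status_id <> 2
    ORDER BY r.requestID
""").fetchall()

print(in_where)  # [(1, 1)]
print(in_on)     # [(1, 1), (2, None), (3, None)]
```

Note that with the constraint in ON, request 2 is still returned, only without its status row. Whether that is the desired result depends on the original question.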
EDIT: Changed on clause based on Piyush's comment.
WHERE Clause vs ON when using JOIN
No, the query optimizer is smart enough to choose the same execution plan for both examples.
You can use SHOWPLAN to check the execution plan.
Nevertheless, you should put all join conditions in the ON clause and all the restrictions in the WHERE clause.
sql left join criteria in join vs where clause
Theoretically, yes, if they can have multiple alerts. However, since you are excluding payments with alerts, this should not be a problem. If you were including them, it could be. If it were a problem, you should use a "not in" subquery instead of a left outer join, since the left outer join can produce duplicate records if the relationship is not 1:1.
Having criteria in the where clause excludes the entire row if it doesn't match the criteria. Having it in the join clause means the joined record is not shown but the "parent" is.
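A short sketch of the duplication point, run through Python's sqlite3 module. The payments and alerts tables, their columns, and the rows are all hypothetical, since the original query is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id INTEGER);
    CREATE TABLE alerts (payment_id INTEGER, kind TEXT);
    INSERT INTO payments VALUES (1), (2);
    -- payment 1 has two alerts, payment 2 has none
    INSERT INTO alerts VALUES (1, 'late'), (1, 'fraud');
""")

# Including alerts: the 1:many relationship repeats payment 1.
with_alerts = conn.execute("""
    SELECT p.id, a.kind
    FROM payments p
    LEFT JOIN alerts a ON a.payment_id = p.id
    ORDER BY p.id, a.kind
""").fetchall()

# Excluding payments that have alerts: no duplicates can survive.
excluded_by_join = conn.execute("""
    SELECT p.id
    FROM payments p
    LEFT JOIN alerts a ON a.payment_id = p.id
    WHERE a.payment_id IS NULL
""").fetchall()

# The same exclusion written as a NOT IN subquery.
excluded_by_not_in = conn.execute("""
    SELECT p.id
    FROM payments p
    WHERE p.id NOT IN (SELECT payment_id FROM alerts)
""").fetchall()

print(with_alerts)         # [(1, 'fraud'), (1, 'late'), (2, None)]
print(excluded_by_join)    # [(2,)]
print(excluded_by_not_in)  # [(2,)]
```

One caveat with the NOT IN form: if the subquery can return a NULL payment_id, NOT IN matches nothing, so it is only a safe substitute when that column cannot be NULL.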
Why would LEFT JOIN on a field to then later filter it out in WHERE clause?
The query you have in the question is basically equivalent to the following query:
SELECT ID, Name, Phone
FROM Table1
WHERE NOT EXISTS
(
SELECT 1
FROM Table2
WHERE Table1.ID = Table2.ID
)
Meaning it selects all the records in Table1 that do not have a correlated record in Table2.
The execution plan for both queries will most likely be the same (Personally, I've never seen a case when they produce a different execution plan, but I don't rule that out), so both queries should be equally efficient, and it's up to you to decide whether the left join or the exists syntax is more readable to you.
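The equivalence of the two forms can be checked on sample data. The question's own query is not reproduced on this page, so the LEFT JOIN ... IS NULL form below is an assumption about what it looked like. The rows are made up, and Python's sqlite3 module stands in for SQL Server, so this compares results only, not execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Name TEXT, Phone TEXT);
    CREATE TABLE Table2 (ID INTEGER);
    INSERT INTO Table1 VALUES
        (1, 'Ann', '555-0101'),
        (2, 'Bob', '555-0102'),
        (3, 'Cy',  '555-0103');
    -- ID 2 appears twice to show that duplicates do not matter here
    INSERT INTO Table2 VALUES (2), (2);
""")

# Assumed shape of the original query: join, then keep only unmatched rows.
left_join_form = conn.execute("""
    SELECT Table1.ID, Name, Phone
    FROM Table1
    LEFT JOIN Table2 ON Table1.ID = Table2.ID
    WHERE Table2.ID IS NULL
    ORDER BY Table1.ID
""").fetchall()

# The NOT EXISTS form from the answer.
not_exists_form = conn.execute("""
    SELECT ID, Name, Phone
    FROM Table1
    WHERE NOT EXISTS
    (
        SELECT 1
        FROM Table2
        WHERE Table1.ID = Table2.ID
    )
    ORDER BY ID
""").fetchall()

print(left_join_form)   # [(1, 'Ann', '555-0101'), (3, 'Cy', '555-0103')]
print(not_exists_form)  # [(1, 'Ann', '555-0101'), (3, 'Cy', '555-0103')]
```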
Why and when a LEFT JOIN with condition in WHERE clause is not equivalent to the same LEFT JOIN in ON?
The on clause is used when the join is looking for matching rows. The where clause is used to filter rows after all the joining is done.
An example with Disney toons voting for president:
declare @candidates table (name varchar(50));
insert @candidates values
('Obama'),
('Romney');
declare @votes table (voter varchar(50), voted_for varchar(50));
insert @votes values
('Mickey Mouse', 'Romney'),
('Donald Duck', 'Obama');
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
and v.voter = 'Donald Duck'
This still returns Romney even though Donald didn't vote for him. If you move the condition from the on to the where clause:
select *
from @candidates c
left join
@votes v
on c.name = v.voted_for
where v.voter = 'Donald Duck'
Romney will no longer be in the result set.