Difference Between Filtering Queries in Join and Where

Difference between filtering queries in JOIN and WHERE?

The answer is NO difference, but:

I will always prefer to do the following.

  • Always keep the Join Conditions in ON clause
  • Always put the filter's in where clause

This makes the query more readable.

So I will use this query:

SELECT value
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table1.id = 1

However when you are using OUTER JOIN'S there is a big difference in keeping the filter in the ON condition and Where condition.

Logical Query Processing

The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.

(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;

Flow diagram logical query processing

Sample Image

  • (1) FROM: The FROM phase identifies the query’s source tables and
    processes table operators. Each table operator applies a series of
    sub phases. For example, the phases involved in a join are (1-J1)
    Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM
    phase generates virtual table VT1.

  • (1-J1) Cartesian Product: This phase performs a Cartesian product
    (cross join) between the two tables involved in the table operator,
    generating VT1-J1.

  • (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on
    the predicate that appears in the ON clause (<on_predicate>). Only
    rows for which the predicate evaluates to TRUE are inserted into
    VT1-J2.
  • (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to
    CROSS JOIN or INNER JOIN), rows from the preserved table or tables
    for which a match was not found are added to the rows from VT1-J2 as
    outer rows, generating VT1-J3.
  • (2) WHERE: This phase filters the rows from VT1 based on the
    predicate that appears in the WHERE clause (). Only
    rows for which the predicate evaluates to TRUE are inserted into VT2.
  • (3) GROUP BY: This phase arranges the rows from VT2 in groups based
    on the column list specified in the GROUP BY clause, generating VT3.
    Ultimately, there will be one result row per group.
  • (4) HAVING: This phase filters the groups from VT3 based on the
    predicate that appears in the HAVING clause (<having_predicate>).
    Only groups for which the predicate evaluates to TRUE are inserted
    into VT4.
  • (5) SELECT: This phase processes the elements in the SELECT clause,
    generating VT5.
  • (5-1) Evaluate Expressions: This phase evaluates the expressions in
    the SELECT list, generating VT5-1.
  • (5-2) DISTINCT: This phase removes duplicate rows from VT5-1,
    generating VT5-2.
  • (5-3) TOP: This phase filters the specified top number or percentage
    of rows from VT5-2 based on the logical ordering defined by the ORDER
    BY clause, generating the table VT5-3.
  • (6) ORDER BY: This phase sorts the rows from VT5-3 according to the
    column list specified in the ORDER BY clause, generating the cursor
    VC6.

it is referred from book "T-SQL Querying (Developer Reference)"

Which SQL query is faster? Filter on Join criteria or Where clause?

Performance-wise, they are the same (and produce the same plans)

Logically, you should make the operation that still has sense if you replace INNER JOIN with a LEFT JOIN.

In your very case this will look like this:

SELECT  *
FROM TableA a
LEFT JOIN
TableXRef x
ON x.TableAID = a.ID
AND a.ID = 1
LEFT JOIN
TableB b
ON x.TableBID = b.ID

or this:

SELECT  *
FROM TableA a
LEFT JOIN
TableXRef x
ON x.TableAID = a.ID
LEFT JOIN
TableB b
ON b.id = x.TableBID
WHERE a.id = 1

The former query will not return any actual matches for a.id other than 1, so the latter syntax (with WHERE) is logically more consistent.

what is the difference between using filter condition in WHERE clause and JOIN condition

With an INNER JOIN, there is no difference, either in terms of performance or the definition of the result set. There is a difference with OUTER JOINs.

The WHERE clause filters the results of the FROM after the FROM is evaluated. So, consider these two queries:

SELECT E.id, D.name, E.sal
from EMP E LEFT JOIN
DEP D
ON E.id = D.id
WHERE E.sal > 100;

SELECT E.id, D.name, E.sal
from EMP E LEFT JOIN
DEP D
ON E.id = D.id AND E.sal > 100;

The first filters the EMP table to get the appropriate ids. The second does not filter the EMP table. Why not? Well, the definition of LEFT JOIN says to take all rows in the first table, regardless of whether the ON finds a matching record in the second table. So, there is no filtering.

Now, consider this version:

SELECT E.id, D.name, E.sal
from EMP E RIGHT JOIN
DEP D
ON E.id = D.id
WHERE E.sal > 100;

SELECT E.id, D.name, E.sal
from EMP E RIGHT JOIN
DEP D
ON E.id = D.id AND E.sal > 100;

The first turns the RIGHT JOIN to an INNER JOIN. Why? Because for non-matching rows, E.sal is NULL and fails the WHERE. The second version keeps all departments, even those that have no employees matching the condition. Note: I (and others) much prefer left outer joins to right outer joins in terms of following the logic. The logic for a left outer join is simple: keep all rows in the first table.

The FULL OUTER JOIN combines these two situations.

What's the difference between filtering in the WHERE clause compared to the ON clause?

My answer may be a bit off-topic, but I would like to highlight a problem that may occur when you turn your INNER JOIN into an OUTER JOIN.

In this case, the most important difference between putting predicates (test conditions) on the ON or WHERE clauses is that you can turn LEFT or RIGHT OUTER JOINS into INNER JOINS without noticing it, if you put fields of the table to be left out in the WHERE clause.

For example, in a LEFT JOIN between tables A and B, if you include a condition that involves fields of B on the WHERE clause, there's a good chance that there will be no null rows returned from B in the result set. Effectively, and implicitly, you turned your LEFT JOIN into an INNER JOIN.

On the other hand, if you include the same test in the ON clause, null rows will continue to be returned.

For example, take the query below:

SELECT * FROM A 
LEFT JOIN B
ON A.ID=B.ID

The query will also return rows from A that do not match any of B.

Take this second query:

SELECT * FROM A 
LEFT JOIN B
WHERE A.ID=B.ID

This second query won't return any rows from A that don't match B, even though you think it will because you specified a LEFT JOIN. That's because the test A.ID=B.ID will leave out of the result set any rows with B.ID that are null.

That's why I favor putting predicates in the ON clause rather than in the WHERE clause.

SQL left join with filter in JOIN condition vs filter in WHERE clause

The big difference with the Where condition b.status is null or b.status in (10, 100)
is when b.status is say 1 as well as b.id=a.id

In the first query you will still get the row from table A with corresponding B part as NULL as On condition is not fully satisfied.
In the second query you will get the row in the JOIN for both a and b tables which will be lost in the where clause.

Difference between where and and clause in join sql query

Base on the following two test result

select a.id, a.name,a.country from table a left join table b
on a.id = b.id
where a.name is not null

is faster (237 Vs 460). As far as I know, it is a standard.

Sample Image

Sample Image



Related Topics



Leave a reply



Submit