INNER JOIN ON vs WHERE clause
INNER JOIN
is ANSI syntax that you should use.
It is generally considered more readable, especially when you join lots of tables.
It can also be easily replaced with an OUTER JOIN
whenever a need arises.
The WHERE
syntax is more relational model oriented.
A result of two tables JOIN
ed is a cartesian product of the tables to which a filter is applied which selects only those rows with joining columns matching.
It's easier to see this with the WHERE
syntax.
As for your example, in MySQL (and in SQL generally) these two queries are synonyms.
Also, note that MySQL also has a STRAIGHT_JOIN
clause.
Using this clause, you can control the JOIN
order: which table is scanned in the outer loop and which one is in the inner loop.
You cannot control this in MySQL using WHERE
syntax.
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345
. The second will return all orders, but only order 12345
will have any lines associated with it.
With an INNER JOIN
, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
Which performs first WHERE clause or JOIN clause
The conceptual order of query processing is:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY
But this is just a conceptual order. In fact the engine may decide to rearrange clauses. Here is proof. Let's make 2 tables with 1000000 rows each:
CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))
;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))
INSERT INTO test1(name) SELECT 'a' FROM cte
Now run 2 queries:
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1
Notice that the first query will filter most rows out in the join
condition, but the second query filters in the where
condition. Look at the produced plans:
1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)
2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)
This means that in the first query optimized, the engine decided first to evaluate the join
condition to filter out rows. In the second query, it evaluated the where
clause first.
INNER JOIN condition in WHERE clause or ON clause?
For inner joins like this they are logically equivalent. However, you can run in to situations where a condition in the join clause means something different than a condition in the where clause.
As a simple illustration, imagine you do a left join like so;
select x.id
from x
left join y
on x.id = y.id
;
Here we're taking all the rows from x, regardless of whether there is a matching id in y. Now let's say our join condition grows - we're not just looking for matches in y based on the id but also on id_type.
select x.id
from x
left join y
on x.id = y.id
and y.id_type = 'some type'
;
Again this gives all the rows in x regardless of whether there is a matching (id, id_type) in y.
This is very different, though:
select x.id
from x
left join y
on x.id = y.id
where y.id_type = 'some type'
;
In this situation, we're picking all the rows of x and trying to match to rows from y. Now for rows for which there is no match in y, y.id_type will be null. Because of that, y.id_type = 'some type' isn't satisfied, so those rows where there is no match are discarded, which effectively turned this in to an inner join.
Long story short: for inner joins it doesn't matter where the conditions go but for outer joins it can.
SQL INNER JOIN vs LEFT JOIN with a WHERE
Yes, they will return the same result. The left join without the where clause would read as show me all the records from the header table and the related items from the details table or null for the details where there are no matches.
Adding a where clause relating the ids effectively transforms the left join to an inner join by eliminating the non-matching rows that would have shown up as having null for the detail part.
In some databases, like MS SQL Server, the left join would show up as an inner join in the query execution plan.
Although you stated that you don't want Venn diagrams I can't help referring you to this question and its answers even though they are filled with (in my opinion very helpful) Venn diagrams.
inner join on condition or using where?
SQL is not a procedural language. It is a descriptive language. A query describes the result set that you want to produce.
With an inner join, the two queries in your question are identical -- they produce the same result set under all circumstances. Which to prefer is a stylistic preference. MySQL should treat the two the same way from an optimization perspective.
One preference is that filters on a single table are more appropriate for WHERE
and ON
.
With an outer join, the two queries are not the same, and you should use the one that expresses your intent.
WHERE Clause vs ON when using JOIN
No, the query optimizer is smart enough to choose the same execution plan for both examples.
You can use SHOWPLAN
to check the execution plan.
Nevertheless, you should put all join connection on the ON
clause and all the restrictions on the WHERE
clause.
Difference between filtering queries in JOIN and WHERE?
The answer is NO difference, but:
I will always prefer to do the following.
- Always keep the Join Conditions in
ON
clause - Always put the filter's in
where
clause
This makes the query more readable.
So I will use this query:
SELECT value
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table1.id = 1
However when you are using OUTER JOIN'S
there is a big difference in keeping the filter in the ON
condition and Where
condition.
Logical Query Processing
The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.
(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;
Flow diagram logical query processing
(1) FROM: The FROM phase identifies the query’s source tables and
processes table operators. Each table operator applies a series of
sub phases. For example, the phases involved in a join are (1-J1)
Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM
phase generates virtual table VT1.(1-J1) Cartesian Product: This phase performs a Cartesian product
(cross join) between the two tables involved in the table operator,
generating VT1-J1.- (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on
the predicate that appears in the ON clause (<on_predicate>). Only
rows for which the predicate evaluates to TRUE are inserted into
VT1-J2. - (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to
CROSS JOIN or INNER JOIN), rows from the preserved table or tables
for which a match was not found are added to the rows from VT1-J2 as
outer rows, generating VT1-J3. - (2) WHERE: This phase filters the rows from VT1 based on the
predicate that appears in the WHERE clause (). Only
rows for which the predicate evaluates to TRUE are inserted into VT2. - (3) GROUP BY: This phase arranges the rows from VT2 in groups based
on the column list specified in the GROUP BY clause, generating VT3.
Ultimately, there will be one result row per group. - (4) HAVING: This phase filters the groups from VT3 based on the
predicate that appears in the HAVING clause (<having_predicate>).
Only groups for which the predicate evaluates to TRUE are inserted
into VT4. - (5) SELECT: This phase processes the elements in the SELECT clause,
generating VT5. - (5-1) Evaluate Expressions: This phase evaluates the expressions in
the SELECT list, generating VT5-1. - (5-2) DISTINCT: This phase removes duplicate rows from VT5-1,
generating VT5-2. - (5-3) TOP: This phase filters the specified top number or percentage
of rows from VT5-2 based on the logical ordering defined by the ORDER
BY clause, generating the table VT5-3. - (6) ORDER BY: This phase sorts the rows from VT5-3 according to the
column list specified in the ORDER BY clause, generating the cursor
VC6.
it is referred from book "T-SQL Querying (Developer Reference)"
SQL Query - INNER JOIN - WHERE vs AND
There is no difference at all between the two queries. As a matter of convention, conditions between the two queries are often put in the on
clause:
select C.*
from customer C inner join
salesman S
on C.salesman_id = S.salesman_id and S.city <> C.city
where S.commission > 0.12;
Functionally, though, additional conditions can go in either the on
clause or the where
clause -- the results and performance should be the same. Note: This is not true of an outer join. In that case, conditions should often go in the on
clause.
Related Topics
Stored Procedure That Automatically Delete Rows Older Than 7 Days in MySQL
How to Obtain a Query Execution Plan in SQL Server
You Can't Specify Target Table For Update in from Clause
How to Select the Newest Four Items Per Category
Selecting With Multiple Where Conditions on Same Column
Any Downsides of Using Data Type "Text" For Storing Strings
SQL Server - Return Value After Insert
In VS or in the SQL Where Clause
How to Calculate Percentage With a SQL Statement
MySQL - Error 1045 - Access Denied
Remove Duplicate Rows in MySQL
How Stuff and 'For Xml Path' Work in SQL Server
SQL Statement Is Ignoring Where Parameter