SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.
With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
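To see the difference concretely, here is a minimal sketch that runs both queries through Python's built-in sqlite3 module. The sample rows (a second order 12346 and the item names) are made up for illustration, and the column list is narrowed from SELECT * to keep the output short.

```python
import sqlite3

# In-memory database with hypothetical sample data: two orders, three lines.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER PRIMARY KEY);
    CREATE TABLE OrderLines (OrderID INTEGER, Item TEXT);
    INSERT INTO Orders (ID) VALUES (12345), (12346);
    INSERT INTO OrderLines (OrderID, Item) VALUES
        (12345, 'widget'), (12345, 'gadget'), (12346, 'gizmo');
""")

# Filter in WHERE: applied after the join, so only order 12345 survives.
where_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
    WHERE Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.Item
""").fetchall()

# Filter in ON: every order is kept, but only 12345 is matched to its lines.
on_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
                        AND Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.Item
""").fetchall()

print(where_rows)  # [(12345, 'gadget'), (12345, 'widget')]
print(on_rows)     # [(12345, 'gadget'), (12345, 'widget'), (12346, None)]
```

Order 12346 has a line in the table, yet it comes back with a NULL item in the second result because the ON condition rejected the match without removing the order itself.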
INNER JOIN ON vs WHERE clause
INNER JOIN is ANSI syntax that you should use.
It is generally considered more readable, especially when you join lots of tables.
It can also be easily replaced with an OUTER JOIN whenever a need arises.
The WHERE syntax is more relational model oriented.
The result of JOINing two tables is the cartesian product of the tables, to which a filter is applied that keeps only the rows whose joining columns match.
It's easier to see this with the WHERE syntax.
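The "cartesian product plus filter" view can be checked with a small sketch. It uses Python's sqlite3 module and two hypothetical tables, a and b, that are not part of the original question.

```python
import sqlite3

# Hypothetical tables: 3 rows in a, 3 rows in b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, label TEXT);
    CREATE TABLE b (a_id INTEGER, note TEXT);
    INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO b VALUES (1, 'x'), (1, 'y'), (3, 'z');
""")

# The unfiltered cartesian product pairs every row of a with every row of b.
product_size = conn.execute("SELECT COUNT(*) FROM a CROSS JOIN b").fetchone()[0]

# WHERE syntax: list both tables, then filter the product.
where_rows = conn.execute("""
    SELECT a.label, b.note
    FROM a, b
    WHERE b.a_id = a.id
    ORDER BY a.id, b.note
""").fetchall()

# INNER JOIN syntax: the same matching condition, written in ON.
join_rows = conn.execute("""
    SELECT a.label, b.note
    FROM a
    INNER JOIN b ON b.a_id = a.id
    ORDER BY a.id, b.note
""").fetchall()

print(product_size)             # 9
print(where_rows)               # [('one', 'x'), ('one', 'y'), ('three', 'z')]
print(where_rows == join_rows)  # True
```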
As for your example, in MySQL (and in SQL generally) these two queries are synonyms.
Also, note that MySQL has a STRAIGHT_JOIN clause.
Using this clause, you can control the JOIN order: which table is scanned in the outer loop and which one is in the inner loop.
You cannot control this in MySQL using WHERE syntax.
WHERE Clause vs ON when using JOIN
No, the query optimizer is smart enough to choose the same execution plan for both examples.
You can use SHOWPLAN to check the execution plan.
Nevertheless, you should put all join conditions in the ON clause and all restrictions in the WHERE clause.
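SHOWPLAN is specific to SQL Server. As a rough stand-in, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module to print the plan for each form of an inner join. The tables parent and child and the helper plan() are hypothetical, and the plan text depends on the SQLite version, so the code only prints it rather than asserting what it says.

```python
import sqlite3

# Hypothetical tables for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (parent_id INTEGER, val TEXT);
    INSERT INTO parent VALUES (1), (2), (3);
    INSERT INTO child VALUES (1, 'keep'), (2, 'drop'), (3, 'keep');
""")

# Same inner join, with the restriction placed in ON ...
filter_in_on = """
    SELECT p.id, c.val
    FROM parent p
    INNER JOIN child c ON c.parent_id = p.id AND c.val = 'keep'
"""
# ... and with the restriction placed in WHERE.
filter_in_where = """
    SELECT p.id, c.val
    FROM parent p
    INNER JOIN child c ON c.parent_id = p.id
    WHERE c.val = 'keep'
"""

def plan(sql):
    # EXPLAIN QUERY PLAN returns four columns; the last one is the readable detail.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

rows_on = sorted(conn.execute(filter_in_on).fetchall())
rows_where = sorted(conn.execute(filter_in_where).fetchall())

print(rows_on == rows_where)  # True
print(plan(filter_in_on))     # plan text varies by SQLite version
print(plan(filter_in_where))
```

On SQL Server itself, comparing the two plans with SHOWPLAN as described above is the direct check.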
What's the difference between where clause and on clause when table left join?
The WHERE clause applies to the whole result set; the ON clause only applies to the join in question.
In the example supplied, all of the additional conditions related to fields on the inner side of the join - so in this example, the two queries are effectively identical.
However, if you had included a condition on a value in the table in the outer side of the join, it would have made a significant difference.
You can get more from this link: http://ask.sqlservercentral.com/questions/80067/sql-data-filter-condition-in-join-vs-where-clause
For example:
select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 and t2.f4=1
select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 where t2.f4=1
These do different things: the former will left join to t2 records where f4 is 1, while the latter has effectively been turned back into an inner join to t2.
inner join on condition or using where?
SQL is not a procedural language. It is a descriptive language. A query describes the result set that you want to produce.
With an inner join, the two queries in your question are identical -- they produce the same result set under all circumstances. Which to prefer is a stylistic preference. MySQL should treat the two the same way from an optimization perspective.
One preference is that filters on a single table are more appropriate for WHERE than for ON.
With an outer join, the two queries are not the same, and you should use the one that expresses your intent.
Condition within JOIN or WHERE
Relational algebra allows the predicates in the WHERE clause and the INNER JOIN to be interchanged, so even INNER JOIN queries with WHERE clauses can have their predicates rearranged by the optimizer so that rows may already be excluded during the JOIN process.
I recommend you write the queries in the most readable way possible.
Sometimes this includes making the INNER JOIN relatively "incomplete" and putting some of the criteria in the WHERE simply to make the lists of filtering criteria more easily maintainable.
For example, instead of:
SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
AND c.State = 'NY'
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
AND a.Status = 1
Write:
SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
AND a.Status = 1
But it depends, of course.
Which performs first WHERE clause or JOIN clause
The conceptual order of query processing is:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY
But this is just a conceptual order. In fact the engine may decide to rearrange clauses. Here is proof. Let's make 2 tables with 1000000 rows each:
CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))
;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))
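INSERT INTO test1(name) SELECT 'a' FROM cte
-- Copy the rows into test2 so that both tables hold 1000000 rows
INSERT INTO test2(name) SELECT name FROM test1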
Now run 2 queries:
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1
Notice that the first query will filter most rows out in the join condition, but the second query filters in the where condition. Look at the produced plans:
1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)
2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)
This means that in the first query, the optimizer decided to evaluate the join condition first to filter out rows. In the second query, it evaluated the where clause first.
INNER JOIN condition in WHERE clause or ON clause?
For inner joins like this they are logically equivalent. However, you can run into situations where a condition in the join clause means something different than a condition in the where clause.
As a simple illustration, imagine you do a left join like so;
select x.id
from x
left join y
on x.id = y.id
;
Here we're taking all the rows from x, regardless of whether there is a matching id in y. Now let's say our join condition grows - we're not just looking for matches in y based on the id but also on id_type.
select x.id
from x
left join y
on x.id = y.id
and y.id_type = 'some type'
;
Again this gives all the rows in x regardless of whether there is a matching (id, id_type) in y.
This is very different, though:
select x.id
from x
left join y
on x.id = y.id
where y.id_type = 'some type'
;
In this situation, we're picking all the rows of x and trying to match them to rows from y. For rows that have no match in y, y.id_type will be null. Because of that, y.id_type = 'some type' isn't satisfied, so the rows with no match are discarded, which effectively turns this into an inner join.
Long story short: for inner joins it doesn't matter where the conditions go but for outer joins it can.
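The three x/y queries above can be run side by side with a short sketch using Python's sqlite3 module. The sample rows are invented for illustration: id 1 has a matching type in y, id 2 has a different type, and id 3 has no row in y at all.

```python
import sqlite3

# Hypothetical sample data for the x and y tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER);
    CREATE TABLE y (id INTEGER, id_type TEXT);
    INSERT INTO x VALUES (1), (2), (3);
    INSERT INTO y VALUES (1, 'some type'), (2, 'other type');
""")

def ids(sql):
    # Return the first column of every result row as a plain list.
    return [row[0] for row in conn.execute(sql)]

# Condition in ON: all rows of x are kept.
on_ids = ids("""
    SELECT x.id FROM x
    LEFT JOIN y ON x.id = y.id AND y.id_type = 'some type'
    ORDER BY x.id
""")

# Condition in WHERE: unmatched rows have a NULL id_type and are discarded.
where_ids = ids("""
    SELECT x.id FROM x
    LEFT JOIN y ON x.id = y.id
    WHERE y.id_type = 'some type'
    ORDER BY x.id
""")

# A plain inner join returns the same rows as the WHERE version.
inner_ids = ids("""
    SELECT x.id FROM x
    INNER JOIN y ON x.id = y.id AND y.id_type = 'some type'
    ORDER BY x.id
""")

print(on_ids)     # [1, 2, 3]
print(where_ids)  # [1]
print(inner_ids)  # [1]
```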