Does INNER JOIN performance depend on the order of tables?
Aliases and the order of the tables in the join (assuming it's an INNER JOIN) don't affect the final outcome, and thus don't affect performance, since the optimizer reorders the joins (if needed) when the query is executed.
You can read some more basic concepts about relational algebra here:
http://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators
Does the join order matter in SQL?
For INNER joins, no, the order doesn't matter. The queries will return the same results, as long as you change your selects from SELECT * to SELECT a.*, b.*, c.*.
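If you want to check this yourself, here is a minimal sketch using Python's built-in sqlite3 module. The tables a and b and their name and label columns are made up for illustration, and the script only shows that both table orders return the same rows; it says nothing about how a particular optimizer picks its plan.

```python
import sqlite3

# Hypothetical sample tables, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ab_id INTEGER, name TEXT);
    CREATE TABLE b (ab_id INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO b VALUES (1, 'b1'), (2, 'b2'), (4, 'b4');
""")

# Same explicit select list, tables written in opposite order.
a_first = conn.execute(
    "SELECT a.*, b.* FROM a INNER JOIN b ON b.ab_id = a.ab_id"
).fetchall()
b_first = conn.execute(
    "SELECT a.*, b.* FROM b INNER JOIN a ON a.ab_id = b.ab_id"
).fetchall()

# Row order is not guaranteed without ORDER BY, so compare sorted lists.
same_rows = sorted(a_first) == sorted(b_first)
print(same_rows)
```

The explicit select list matters here: with SELECT * the columns would come back in a different order for the two queries, even though the rows match.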
For (LEFT, RIGHT or FULL) OUTER joins, yes, the order matters, and things are much more complicated.
First, outer joins are not commutative, so a LEFT JOIN b is not the same as b LEFT JOIN a.
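To make the non-commutativity concrete, here is a small sketch with Python's sqlite3 module. The sample tables and their name and label columns are invented for the example; the point is only that swapping the two sides of a LEFT JOIN changes which unmatched rows survive.

```python
import sqlite3

# Hypothetical sample tables: a has an unmatched row (3), b has an unmatched row (4).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ab_id INTEGER, name TEXT);
    CREATE TABLE b (ab_id INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO b VALUES (1, 'b1'), (2, 'b2'), (4, 'b4');
""")

# Sets are used because the rows contain NULLs (None), which cannot be sorted.
a_left_b = set(conn.execute(
    "SELECT a.*, b.* FROM a LEFT JOIN b ON b.ab_id = a.ab_id"
).fetchall())
b_left_a = set(conn.execute(
    "SELECT a.*, b.* FROM b LEFT JOIN a ON a.ab_id = b.ab_id"
).fetchall())

print(a_left_b - b_left_a)  # row kept only by a LEFT JOIN b
print(b_left_a - a_left_b)  # row kept only by b LEFT JOIN a
```

a LEFT JOIN b keeps the unmatched row from a, while b LEFT JOIN a keeps the unmatched row from b, so the two results differ.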
Outer joins are not associative either, so in your examples which involve both (commutativity and associativity) properties:
a LEFT JOIN b
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id
is equivalent to:
a LEFT JOIN c
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id
but:
a LEFT JOIN b
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id
AND c.bc_id = b.bc_id
is not equivalent to:
a LEFT JOIN c
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id
AND b.bc_id = c.bc_id
Another (hopefully simpler) associativity example. Think of this as (a LEFT JOIN b) LEFT JOIN c:
a LEFT JOIN b
ON b.ab_id = a.ab_id -- AB condition
LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition
This is equivalent to a LEFT JOIN (b LEFT JOIN c):
a LEFT JOIN
b LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition
ON b.ab_id = a.ab_id -- AB condition
only because we have "nice" ON conditions. Both ON b.ab_id = a.ab_id and c.bc_id = b.bc_id are equality checks and do not involve NULL comparisons.
You can even have conditions with other operators or more complex ones like: ON a.x <= b.x or ON a.x = 7 or ON a.x LIKE b.x or ON (a.x, a.y) = (b.x, b.y) and the two queries would still be equivalent.
If, however, any of these involved IS NULL or a function that is related to nulls like COALESCE(), for example if the condition was b.ab_id IS NULL, then the two queries would not be equivalent.
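The difference between a "nice" condition and a NULL-related one can be sketched with Python's sqlite3 module. The sample rows are invented, and a LEFT JOIN (b LEFT JOIN c) is written as a derived table with hypothetical aliases (bc, b_ab_id, c_bc_id) to keep the syntax simple; treat it as an illustration under those assumptions rather than a general proof.

```python
import sqlite3

# Hypothetical data: a has two rows, only one of which matches a row in b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ab_id INTEGER);
    CREATE TABLE b (ab_id INTEGER, bc_id INTEGER);
    CREATE TABLE c (bc_id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 10);
    INSERT INTO c VALUES (10);
""")

def both_groupings(bc_condition):
    """Run the same BC condition under both groupings of the joins."""
    # (a LEFT JOIN b) LEFT JOIN c
    left_nested = conn.execute(f"""
        SELECT a.ab_id, b.ab_id, c.bc_id
        FROM a
        LEFT JOIN b ON b.ab_id = a.ab_id
        LEFT JOIN c ON {bc_condition}
    """).fetchall()
    # a LEFT JOIN (b LEFT JOIN c), written as a derived table
    right_nested = conn.execute(f"""
        SELECT a.ab_id, bc.b_ab_id, bc.c_bc_id
        FROM a
        LEFT JOIN (
            SELECT b.ab_id AS b_ab_id, c.bc_id AS c_bc_id
            FROM b
            LEFT JOIN c ON {bc_condition}
        ) AS bc ON bc.b_ab_id = a.ab_id
    """).fetchall()
    return set(left_nested), set(right_nested)

nice_left, nice_right = both_groupings("c.bc_id = b.bc_id")
null_left, null_right = both_groupings("b.ab_id IS NULL")

print(nice_left == nice_right)  # equality condition: groupings agree
print(null_left == null_right)  # IS NULL condition: groupings disagree
```

With the IS NULL condition, the first grouping evaluates it after b has been NULL-extended for the unmatched row of a, so that row picks up a row from c. In the second grouping the condition only ever sees real rows of b, so nothing from c is attached.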
Does SQL JOIN order affect performance?
No, the JOIN order is changed during optimization.
The only caveat is the FORCE ORDER query hint (OPTION (FORCE ORDER) in SQL Server), which forces the joins to happen in the exact order you have specified them.
Does the order of conditions in a join affect query performance?
The order of conditions in the ON clause should not affect performance. Why not? At a high level, there are three steps to SQL execution:
- Parse the query
- Construct and optimize the "executable" code
- Execute the code
The second step optimizes the query and should take into account different methods of executing the query. The join conditions are part of this optimization -- all at once.
In theory, the order of the joins does not matter either, although in a very complex query, it could matter.
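One way to look at what the optimizer actually decided is to ask the database for its plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the tables a and b with columns x and y are hypothetical, and the plan text is specific to SQLite and its version, so other engines (and their EXPLAIN output) will look different.

```python
import sqlite3

# Hypothetical tables, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (x INTEGER, y INTEGER);
    CREATE TABLE b (x INTEGER, y INTEGER);
    INSERT INTO a VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO b VALUES (1, 1), (2, 9), (3, 3);
""")

def run(sql):
    """Return the sorted result rows and the plan descriptions for a query."""
    rows = sorted(conn.execute(sql).fetchall())
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    return rows, plan

# Same join, with the tables and the ON conditions written in opposite order.
rows_xy, plan_xy = run(
    "SELECT a.x, a.y FROM a INNER JOIN b ON b.x = a.x AND b.y = a.y"
)
rows_yx, plan_yx = run(
    "SELECT a.x, a.y FROM b INNER JOIN a ON a.y = b.y AND a.x = b.x"
)

print(rows_xy == rows_yx)
print(plan_xy)
print(plan_yx)
```

The results are the same either way. Whether the two plans come out identical is up to the engine, which is why the script prints them instead of assuming anything about them.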
MySQL: in an inner join, does it matter which table comes first?
Instead of the following:
select a.postsTitle
from posts a
inner join bookmarks b
on b.userId = a.userId
and b.userId = :userId
You should consider formatting your JOIN in this format, using the WHERE clause, and proper capitalization:
SELECT p.postsTitle
FROM bookmarks b
INNER JOIN posts p
ON p.userId = b.userId
WHERE b.userId = :userId
While it makes no difference (performance-wise) to MySQL which order you put the tables in with INNER JOIN (MySQL treats them as equal and will optimize them the same way), it's convention to put the table that you are applying the WHERE clause to first. In fact, assuming proper indexes, MySQL will most likely start with the table that has the WHERE clause because it narrows down the result set, and MySQL likes to start with the set that has the fewest rows.
It's also convention to put the joined table's column first in the ON clause. It just reads more logically. While you're at it, use logical table aliases.
The only caveat is if you don't name your columns and instead use SELECT * like the following:
SELECT *
FROM bookmarks b
INNER JOIN posts p
ON p.userId = b.userId
WHERE b.userId = :userId
You'll get the columns in the order they're listed in the query. In this case, you'll get the columns for bookmarks, followed by the columns for posts.
Most would say never use SELECT * in a production query, but if you really must return all columns, and you needed the columns from posts first, you could simply do the following:
SELECT p.*, b.*
FROM bookmarks b
INNER JOIN posts p
ON p.userId = b.userId
WHERE b.userId = :userId
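The column-order effect is easy to see in a small script. This sketch uses Python's sqlite3 module rather than MySQL, and the bookmarkUrl column and sample values are made up (the original only names userId and postsTitle), so take it as an illustration of the idea rather than of MySQL specifically.

```python
import sqlite3

# Hypothetical schema and data; bookmarkUrl is an invented column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookmarks (userId INTEGER, bookmarkUrl TEXT);
    CREATE TABLE posts (userId INTEGER, postsTitle TEXT);
    INSERT INTO bookmarks VALUES (1, 'https://example.com');
    INSERT INTO posts VALUES (1, 'Hello');
""")

def first_row(select_list):
    """Run the join with the given select list and return the single row."""
    return conn.execute(f"""
        SELECT {select_list}
        FROM bookmarks b
        INNER JOIN posts p
        ON p.userId = b.userId
        WHERE b.userId = :userId
    """, {"userId": 1}).fetchone()

star_row = first_row("*")             # bookmarks columns first, then posts
explicit_row = first_row("p.*, b.*")  # posts columns first, then bookmarks

print(star_row)
print(explicit_row)
```

The same row comes back in both cases; only the position of each table's columns in the result changes.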
It's always good to be explicit about the returned result set.
Order of tables in INNER JOIN
So does it imply that if statistics gathered from database objects change, then results would also change?
No. The same query will always produce the same results (provided, of course, that the underlying data is the same). What the author is explaining is that the database may choose one strategy or another to process the query (starting from one table or another, using this or that algorithm to join the rows, and so on). That decision is made based on many factors, some of them being based on information that is available in the statistics.
The key point is that SQL is a declarative language, not a procedural language: you don't get to choose how the database handles the query, you just tell it what result you want.
However, regardless of the algorithm that the database chooses, the result is guaranteed to be consistent.
Note that there are edge cases where the database does not guarantee that results are the same for consecutive executions of the same query (like a query with a row-limiting clause but without an ORDER BY): it's the responsibility of the client to provide a query whose results are properly defined (the language does give you enough rope to hang yourself, if you really want to).
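As a small illustration of that edge case, here is a sketch with Python's sqlite3 module and a made-up posts table (postId is a hypothetical column). Which rows the first query returns is up to the engine; only the second query pins the result down.

```python
import sqlite3

# Hypothetical table, rows deliberately inserted out of order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (postId INTEGER, postsTitle TEXT);
    INSERT INTO posts VALUES (3, 'c'), (1, 'a'), (2, 'b');
""")

# Row limit without ORDER BY: any two rows are a valid answer.
unordered = conn.execute(
    "SELECT postId FROM posts LIMIT 2"
).fetchall()

# Row limit with ORDER BY: the result is fully defined.
ordered = conn.execute(
    "SELECT postId FROM posts ORDER BY postId LIMIT 2"
).fetchall()

print(unordered)
print(ordered)
```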
Should SQL JOINs be placed in particular order for performance reasons?
The documentation for MySQL states "The join optimizer calculates the order in which tables should be joined".
This order is determined based on information about the sizes of the tables and other factors, such as the presence of indexes.
You should put the joins in the order that makes the most sense for reading and maintaining the query.