Join statement order of operation
The placement of the ON
clauses controls the logical order of evaluation.
So first the t1 LEFT JOIN t2 ON t1.fk = t2.pk
happens. The result of this join is a virtual table containing all the matching rows from t1, t2
and (because it is a left outer join) any non matched t1
rows are also preserved with null values for the t2
columns.
This virtual table then participates in the next join. JOIN t3 ON t2.fk = t3.pk
Any t2
records that do not match rows in t1
are not part of the virtual table output from the first stage so won't appear in the final result. Additionally this inner join on t2.fk = t3.pk
will lose any NULL
values of t2.fk
effectively turning your whole thing back into inner joins.
Logical Query Processing is explained well by Itzik Ben Gan here
WHERE and JOIN order of operation
this would depend on many many things (table size, index, key distribution, etc), you should just check the execution plan:
you don't say which database, but here are some ways:
MySql EXPLAIN
SQL Server SET SHOWPLAN_ALL (Transact-SQL)
Oracle EXPLAIN PLAN
what is explain in teradata?
Teradata Capture and compare plans faster with Visual Explain and XML plan logging
Does the join order matter in SQL?
For INNER
joins, no, the order doesn't matter. The queries will return same results, as long as you change your selects from SELECT *
to SELECT a.*, b.*, c.*
.
For (LEFT
, RIGHT
or FULL
) OUTER
joins, yes, the order matters - and (updated) things are much more complicated.
First, outer joins are not commutative, so a LEFT JOIN b
is not the same as b LEFT JOIN a
Outer joins are not associative either, so in your examples which involve both (commutativity and associativity) properties:
a LEFT JOIN b
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id
is equivalent to:
a LEFT JOIN c
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id
but:
a LEFT JOIN b
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id
AND c.bc_id = b.bc_id
is not equivalent to:
a LEFT JOIN c
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id
AND b.bc_id = c.bc_id
Another (hopefully simpler) associativity example. Think of this as (a LEFT JOIN b) LEFT JOIN c
:
a LEFT JOIN b
ON b.ab_id = a.ab_id -- AB condition
LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition
This is equivalent to a LEFT JOIN (b LEFT JOIN c)
:
a LEFT JOIN
b LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition
ON b.ab_id = a.ab_id -- AB condition
only because we have "nice" ON
conditions. Both ON b.ab_id = a.ab_id
and c.bc_id = b.bc_id
are equality checks and do not involve NULL
comparisons.
You can even have conditions with other operators or more complex ones like: ON a.x <= b.x
or ON a.x = 7
or ON a.x LIKE b.x
or ON (a.x, a.y) = (b.x, b.y)
and the two queries would still be equivalent.
If however, any of these involved IS NULL
or a function that is related to nulls like COALESCE()
, for example if the condition was b.ab_id IS NULL
, then the two queries would not be equivalent.
Which performs first WHERE clause or JOIN clause
The conceptual order of query processing is:
1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY
But this is just a conceptual order. In fact the engine may decide to rearrange clauses. Here is proof. Let's make 2 tables with 1000000 rows each:
CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))
;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))
INSERT INTO test1(name) SELECT 'a' FROM cte
Now run 2 queries:
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1
SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1
Notice that the first query will filter most rows out in the join
condition, but the second query filters in the where
condition. Look at the produced plans:
1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)
2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)
This means that in the first query optimized, the engine decided first to evaluate the join
condition to filter out rows. In the second query, it evaluated the where
clause first.
SQL order of operations
It depends on the database.
On SQL Server, run: SET SHOWPLAN_ALL ON
then run the query, you will get an idea of what happens when it runs.
SQL joining three tables, join precedence
All kinds of outer and normal joins are in the same precedence class and operators take effect left-to-right at a given nesting level of the query. You can put the join expression on the right side in parentheses to cause it to take effect first. Remember that you will have to move the ON
clauses around so that they stay with their joins—the join in parentheses takes its ON
clause with it into the parentheses, so it now comes textually before the other ON
clause which will be after the parentheses in the outer join statement.
(PostgreSQL example)
In
SELECT * FROM a LEFT JOIN b ON (a.id = b.id) JOIN c ON (b.ref = c.id);
the a-b join takes effect first, but we can force the b-c join to take effect first by putting it in parentheses, which looks like:
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Often you can express the same thing without extra parentheses by moving the joins around and changing the direction of the outer joins, e.g.
SELECT * FROM b JOIN c ON (b.ref = c.id) RIGHT JOIN a ON (a.id = b.id);
In what order are MySQL JOINs evaluated?
USING (fieldname) is a shorthand way of saying ON table1.fieldname = table2.fieldname.
SQL doesn't define the 'order' in which JOINS are done because it is not the nature of the language. Obviously an order has to be specified in the statement, but an INNER JOIN can be considered commutative: you can list them in any order and you will get the same results.
That said, when constructing a SELECT ... JOIN, particularly one that includes LEFT JOINs, I've found it makes sense to regard the third JOIN as joining the new table to the results of the first JOIN, the fourth JOIN as joining the results of the second JOIN, and so on.
More rarely, the specified order can influence the behaviour of the query optimizer, due to the way it influences the heuristics.
No. The way the query is assembled, it requires that companies and users both have a companyid, jobs has a userid and a jobid and useraccounts has a userid. However, only one of companies or user needs a userid for the JOIN to work.
The WHERE clause is filtering the whole result -- i.e. all JOINed columns -- using a column provided by the jobs table.
Order of join operations: would these two FROM clauses produce the same results?
I have no idea if the 1st is equivalent to the 2nd (first because the queries are unfriendly formatted, to say the least and second because RIGHT
joins are kind of confusing, since many are used to write using LEFT
joins.) But to answer the question:
Is there a way to write the the first one so that no parenthesis are needed?
Yes, you can simply remove the parentheses from the 1st query.
Keeping the parentheses and formatted with some white space:
FROM
SALESTAX
RIGHT JOIN
( ITEMS
RIGHT JOIN
( PINVOICE
INNER JOIN
PINVDET
ON PINVOICE.PNV_INVOICENO = PINVDET.PND_INVOICENO
AND PINVOICE.PNV_Site = PINVDET.PND_Site
)
ON ITEMS.ITE_INVNO = PINVDET.PND_INVNO
)
ON SALESTAX.STX_GroupID = PINVDET.PND_TAX1
FULL JOIN
( CUSTMS
RIGHT JOIN
CUSMER
ON CUSTMS.TMS_CODE = CUSMER.CUS_TERM
)
ON PINVDET.PND_CUSTID = CUSMER.CUS_CustID
Without parentheses and white-space formatted:
FROM
SALESTAX
RIGHT JOIN
ITEMS
RIGHT JOIN
PINVOICE
INNER JOIN
PINVDET
ON PINVOICE.PNV_INVOICENO = PINVDET.PND_INVOICENO
AND PINVOICE.PNV_Site = PINVDET.PND_Site
ON ITEMS.ITE_INVNO = PINVDET.PND_INVNO
ON SALESTAX.STX_GroupID = PINVDET.PND_TAX1
FULL JOIN
CUSTMS
RIGHT JOIN
CUSMER
ON CUSTMS.TMS_CODE = CUSMER.CUS_TERM
ON PINVDET.PND_CUSTID = CUSMER.CUS_CustID
To answer the other question, about the 2nd query, no it isn't equivalent. You missed the table aliases and changed an inner join to left join. This is equivalent to the 1st:
FROM CUSMER
LEFT JOIN
CUSTMS ON CUSTMS.TMS_CODE = CUSMER.CUS_TERM
FULL JOIN
PINVDET
INNER JOIN -- this is changed
PINVOICE ON PINVOICE.PNV_INVOICENO = PINVDET.PND_INVOICENO
AND PINVOICE.PNV_Site = PINVDET.PND_Site
LEFT JOIN
ITEMS ON ITEMS.ITE_INVNO = PINVDET.PND_INVNO
LEFT JOIN
SALESTAX ON SALESTAX.STX_GroupID = PINVDET.PND_TAX1
ON PINVDET.PND_CUSTID = CUSMER.CUS_CustID
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345
. The second will return all orders, but only order 12345
will have any lines associated with it.
With an INNER JOIN
, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
Related Topics
How to Add a Column and Make It a Foreign Key in Single MySQL Statement
Subtract One Day from Datetime
Join to Only the "Latest" Record with T-Sql
Sqlite Multi-Primary Key on a Table, One of Them Is Auto Increment
Date/Timestamp to Record When a Record Was Added to the Table
Difference Between Decimal and Numeric
Flattening Intersecting Timespans
How to Write Union in Doctrine 2.0
SQL Update Query Syntax with Inner Join
How to Do a Simple 'Find and Replace" in Mssql
Get All Characters Before Space in MySQL
Split Varchar into Separate Columns in Oracle
Are There Any Free Tools to Generate 'Insert Into' Scripts in Ms SQL Server
Extracting the Total Number of Seconds from an Interval Data-Type
How to Get the Raw Query String from Laravel's Query Builder Before Executing the Query