Does the Order of Tables in a Join Matter, When Left (Outer) Joins Are Used

Does the order of tables in a join matter, when LEFT (outer) joins are used?

The results are the same, but the query is ambiguous as hell with the implicit CROSS JOINs. Use explicit JOINs.

If you are joining in the WHERE clause then the results may differ because joins and filters are mixed up.

SELECT ....
FROM apples a
JOIN
bananas b ON ...
JOIN
oranges o ON ...
LEFT JOIN
kiwis k ON k.orange_id = o.id
WHERE (filters only)

Notes:

  • INNER JOINS and CROSS JOINS are commutative and associative: order usually does not matter.
  • OUTER JOINS are not, which you identified.
  • SQL is declarative: you tell the optimiser what you want, not how to do it. This removes JOIN order considerations (subject to the previous 2 items)
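
To see why mixing join conditions and filters can change results, here is a minimal sketch using Python's built-in sqlite3 module. The oranges and kiwis tables follow the template above; the ripe column and the sample rows are made up for illustration. It only shows the general effect of moving a filter on the outer-joined table from ON to WHERE, not the asker's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oranges (id INTEGER PRIMARY KEY);
    CREATE TABLE kiwis (id INTEGER PRIMARY KEY, orange_id INTEGER, ripe INTEGER);
    INSERT INTO oranges (id) VALUES (1), (2);
    INSERT INTO kiwis (id, orange_id, ripe) VALUES (10, 1, 1), (11, 2, 0);
""")

# Condition on the outer-joined table inside ON: every orange is kept,
# and oranges without a ripe kiwi get NULL for the kiwi columns.
filter_in_on = conn.execute("""
    SELECT o.id, k.id
    FROM oranges o
    LEFT JOIN kiwis k ON k.orange_id = o.id AND k.ripe = 1
    ORDER BY o.id
""").fetchall()

# Same condition in WHERE: it runs after the join and discards the
# NULL-extended or unripe rows, so the LEFT JOIN acts like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT o.id, k.id
    FROM oranges o
    LEFT JOIN kiwis k ON k.orange_id = o.id
    WHERE k.ripe = 1
    ORDER BY o.id
""").fetchall()

print(filter_in_on)     # [(1, 10), (2, None)]
print(filter_in_where)  # [(1, 10)]
```

Keeping join conditions in ON and row filters in WHERE makes it obvious which of the two behaviours you are asking for.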

Does the join order matter in SQL?

For INNER joins, no, the order doesn't matter. The queries will return the same results, as long as you change your selects from SELECT * to SELECT a.*, b.*, c.*.
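
A quick way to check this yourself, sketched with Python's sqlite3 module. The tables a and b, their columns and the rows are invented for the demo; the point is only that the rows agree once the column list is fixed, while SELECT * follows the table order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, label TEXT);
    CREATE TABLE b (id INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
""")

# Explicit column list: the table order in FROM no longer affects the output.
cols = "a.id, a.label, b.id, b.label"
a_then_b = conn.execute(f"SELECT {cols} FROM a JOIN b ON b.id = a.id").fetchall()
b_then_a = conn.execute(f"SELECT {cols} FROM b JOIN a ON a.id = b.id").fetchall()

# SELECT *: same rows, but the columns come out in FROM order.
star_a_b = conn.execute("SELECT * FROM a JOIN b ON b.id = a.id").fetchall()
star_b_a = conn.execute("SELECT * FROM b JOIN a ON a.id = b.id").fetchall()

print(a_then_b == b_then_a)  # True
print(star_a_b)              # [(2, 'a2', 2, 'b2')]
print(star_b_a)              # [(2, 'b2', 2, 'a2')]
```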


For (LEFT, RIGHT or FULL) OUTER joins, yes, the order matters - and (updated) things are much more complicated.

First, outer joins are not commutative, so a LEFT JOIN b is not the same as b LEFT JOIN a

Outer joins are not associative either, so in your examples which involve both (commutativity and associativity) properties:

a LEFT JOIN b 
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id

is equivalent to:

a LEFT JOIN c 
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id

but:

a LEFT JOIN b 
ON b.ab_id = a.ab_id
LEFT JOIN c
ON c.ac_id = a.ac_id
AND c.bc_id = b.bc_id

is not equivalent to:

a LEFT JOIN c 
ON c.ac_id = a.ac_id
LEFT JOIN b
ON b.ab_id = a.ab_id
AND b.bc_id = c.bc_id
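
A small counterexample makes the difference concrete. This is only a sketch run through Python's sqlite3 module: the column names come from the queries above, but the single row in each table is made up, with the bc_id values chosen so that b and c do not match each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ab_id INTEGER, ac_id INTEGER);
    CREATE TABLE b (ab_id INTEGER, bc_id INTEGER);
    CREATE TABLE c (ac_id INTEGER, bc_id INTEGER);
    INSERT INTO a VALUES (1, 1);
    INSERT INTO b VALUES (1, 5);
    INSERT INTO c VALUES (1, 6);  -- bc_id deliberately differs from b's
""")

# b is joined first, so b survives and c fails the extra bc_id check.
b_first = conn.execute("""
    SELECT a.ab_id, b.bc_id, c.bc_id
    FROM a
    LEFT JOIN b ON b.ab_id = a.ab_id
    LEFT JOIN c ON c.ac_id = a.ac_id AND c.bc_id = b.bc_id
""").fetchall()

# c is joined first, so c survives and b fails the extra bc_id check.
c_first = conn.execute("""
    SELECT a.ab_id, b.bc_id, c.bc_id
    FROM a
    LEFT JOIN c ON c.ac_id = a.ac_id
    LEFT JOIN b ON b.ab_id = a.ab_id AND b.bc_id = c.bc_id
""").fetchall()

print(b_first)  # [(1, 5, None)]
print(c_first)  # [(1, None, 6)]
```

Whichever table is joined first keeps its row; the one joined second is NULL-extended because its ON clause also depends on the other table.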

Another (hopefully simpler) associativity example. Think of this as (a LEFT JOIN b) LEFT JOIN c:

a LEFT JOIN b 
ON b.ab_id = a.ab_id -- AB condition
LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition

This is equivalent to a LEFT JOIN (b LEFT JOIN c):

a LEFT JOIN  
b LEFT JOIN c
ON c.bc_id = b.bc_id -- BC condition
ON b.ab_id = a.ab_id -- AB condition

only because we have "nice" ON conditions. Both ON b.ab_id = a.ab_id and c.bc_id = b.bc_id are equality checks and do not involve NULL comparisons.

You can even have conditions with other operators or more complex ones like: ON a.x <= b.x or ON a.x = 7 or ON a.x LIKE b.x or ON (a.x, a.y) = (b.x, b.y) and the two queries would still be equivalent.

If however, any of these involved IS NULL or a function that is related to nulls like COALESCE(), for example if the condition was b.ab_id IS NULL, then the two queries would not be equivalent.
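
Here is a rough demonstration of that last point, using Python's sqlite3 module. The data is invented (b is left empty on purpose), and the nested form a LEFT JOIN (b LEFT JOIN c) is written as a derived table with a hypothetical alias bc, which is an assumption made to keep the sketch portable rather than the syntax used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ab_id INTEGER);
    CREATE TABLE b (ab_id INTEGER, bc_id INTEGER);
    CREATE TABLE c (bc_id INTEGER);
    INSERT INTO a VALUES (1);
    INSERT INTO c VALUES (10);
""")  # b stays empty, so every b column is NULL after the first LEFT JOIN

# (a LEFT JOIN b) LEFT JOIN c: the NULL-extended b row satisfies IS NULL,
# so the c row is attached.
left_deep = conn.execute("""
    SELECT a.ab_id, b.ab_id, c.bc_id
    FROM a
    LEFT JOIN b ON b.ab_id = a.ab_id
    LEFT JOIN c ON b.ab_id IS NULL
""").fetchall()

# a LEFT JOIN (b LEFT JOIN c): the inner join starts from an empty b,
# produces no rows, and a is NULL-extended for both b and c.
right_nested = conn.execute("""
    SELECT a.ab_id, bc.b_ab_id, bc.c_bc_id
    FROM a
    LEFT JOIN (
        SELECT b.ab_id AS b_ab_id, c.bc_id AS c_bc_id
        FROM b
        LEFT JOIN c ON b.ab_id IS NULL
    ) AS bc ON bc.b_ab_id = a.ab_id
""").fetchall()

print(left_deep)     # [(1, None, 10)]
print(right_nested)  # [(1, None, None)]
```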

Is the order of joining tables indifferent as long as we chose proper join types?

In an inner join, the ordering of the tables in the join doesn't matter - the same rows will make up the result set regardless of the order they are in the join statement.

In either a left or right outer join, the order DOES matter. In A left join B, your result set will contain at least one row for every record in table A, irrespective of whether there is a matching row in table B. If there are non-matching rows, this is likely to be a different result set to B left join A.

In a full outer join, the order again doesn't matter - rows will be produced for each row in each joined table no matter what their order.

Regarding A left join B vs B right join A - these will produce the same results. In simple cases with 2 tables, swapping the tables and changing the direction of the outer join will result in the same result set.

This will also apply to 3 or more tables if all of the outer joins are in the same direction - A left join B left join C will give the same set of results as C right join B right join A.

If you start mixing left and right joins, then you will need to start being more careful. There will almost always be a way to make an equivalent query with re-ordered tables, but at that point sub-queries or bracketing off expressions might be the best way to clarify what you are doing.

As another commenter states, using whatever makes your purpose most clear is usually the best option. The ordering of the tables in your query should make little or no difference performance-wise, as the query optimiser should work this out (although the only way to be sure of this would be to check the execution plans for each option with your own queries and data).

SQL Server: does order of full outer join matter?

No, rearranging the JOIN orders should not affect the performance. MSSQL (as with other DBMS) has a query optimizer whose job it is to find the most efficient query plan for any given query. Generally, these do a pretty good job - so you're unlikely to beat the optimizer easily.

That said, they do get it wrong occasionally. That's where reading an execution plan comes into play. You can add JOIN hints to tell MSSQL how to join your tables (at which point, ordering does matter). You'd generally order from smallest to largest table (though, with a FULL JOIN, it's not likely to matter very much) and follow the rules of thumb for join types.

Since you're doing FULL JOINS, you're basically reading the entirety of 4 tables off disk. That's likely to be very expensive. You may want to re-examine the problem, and see if it can be accomplished in a different way.

Explain which table to choose FROM in a JOIN statement

The order doesn't matter in an INNER JOIN.

However, it does matter in LEFT JOIN and RIGHT JOIN. In a LEFT JOIN, the table in the FROM clause is the primary table; the result will contain every row selected from this table, while matching rows from the table named in the LEFT JOIN can be missing (its columns will then be NULL in the result). RIGHT JOIN is similar but the reverse: rows can be missing in the table named in FROM.

For instance, if you change your query to use LEFT JOIN, you'll see customers with no orders. But if you swapped the order of the tables and used a LEFT JOIN, you wouldn't see these customers. You would see orders with no customer (although such rows probably shouldn't exist).
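
As an illustration, here is a sketch with Python's sqlite3 module. The customers and orders tables, their columns and the two sample customers are assumptions made up for the demo, since the original query is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (100, 1);  -- Bob has no orders
""")

# customers is the primary table: Bob shows up with a NULL order.
customers_first = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# orders is the primary table: only customers that have an order appear.
orders_first = conn.execute("""
    SELECT c.name, o.id
    FROM orders o
    LEFT JOIN customers c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

print(customers_first)  # [('Ann', 100), ('Bob', None)]
print(orders_first)     # [('Ann', 100)]
```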

What is the difference between INNER JOIN and OUTER JOIN?

Assuming you're joining on columns with no duplicates, which is a very common case:

  • An inner join of A and B gives the result of A intersect B, i.e. the inner part of a Venn diagram intersection.

  • An outer join of A and B gives the results of A union B, i.e. the outer parts of a Venn diagram union.
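
The "no duplicates" assumption matters. The sketch below (Python's sqlite3 module, single-column tables a and b with made-up duplicate values) shows that once the join column repeats, an inner join returns every matching pair, which is more than a set intersection would give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (b INTEGER);
    INSERT INTO a VALUES (3), (3);
    INSERT INTO b VALUES (3), (3);
""")

# Every row of a pairs with every matching row of b: 2 x 2 = 4 rows.
joined = conn.execute(
    "SELECT a.a, b.b FROM a INNER JOIN b ON a.a = b.b"
).fetchall()

# A true set intersection collapses the duplicates to a single value.
intersected = conn.execute(
    "SELECT a FROM a INTERSECT SELECT b FROM b"
).fetchall()

print(len(joined))   # 4
print(intersected)   # [(3,)]
```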

Examples

Suppose you have two tables, with a single column each, and data as follows:

A    B
-    -
1    3
2    4
3    5
4    6

Note that (1,2) are unique to A, (3,4) are common, and (5,6) are unique to B.

Inner join

An inner join using either of the equivalent queries gives the intersection of the two tables, i.e. the two rows they have in common.

select * from a INNER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b;

a | b
--+--
3 | 3
4 | 4

Left outer join

A left outer join will give all rows in A, plus any common rows in B.

select * from a LEFT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a = b.b(+);

a |  b
--+-----
1 | null
2 | null
3 |    3
4 |    4

Right outer join

A right outer join will give all rows in B, plus any common rows in A.

select * from a RIGHT OUTER JOIN b on a.a = b.b;
select a.*, b.* from a,b where a.a(+) = b.b;

 a   |  b
-----+----
   3 |  3
   4 |  4
null |  5
null |  6

Full outer join

A full outer join will give you the union of A and B, i.e. all the rows in A and all the rows in B. If something in A doesn't have a corresponding datum in B, then the B portion is null, and vice versa.

select * from a FULL OUTER JOIN b on a.a = b.b;

 a   |  b
-----+-----
   1 | null
   2 | null
   3 |    3
   4 |    4
null |    6
null |    5

How do I decide when to use right joins/left joins or inner joins Or how to determine which table is on which side?

Yes, it depends on the situation you are in.

Why use SQL JOIN?

Answer: Use the SQL JOIN whenever multiple tables must be accessed through an SQL SELECT statement and no results should be returned if there is not a match between the JOINed tables.

Reading this original article on The Code Project will help you a lot: Visual Representation of SQL Joins.


Also check this post: SQL SERVER – Better Performance – LEFT JOIN or NOT IN?.

Find original one at: Difference between JOIN and OUTER JOIN in MySQL.

Self joins. Does Inner, Outer, or Left matter?

It all depends on what you want to do with the data. This answer does a great job of detailing what a self inner join might look like. I recently wrote a report that required comparing grades from two courses a student took in succession. It went something like this:

Given a table student_course:

STUDENT_ID  COURSE  GRADE
1           MTH251  A
1           MTH252  B
2           MTH251  A
2           MTH252  A
3           MTH251  B
3           MTH252  C

Query:

SELECT course1.student_id
, course1.course AS course1
, course1.grade AS grade1
, course2.course AS course2
, course2.grade AS grade2
FROM student_course course1
INNER JOIN student_course course2
ON course1.student_id = course2.student_id
WHERE course1.course = 'MTH251'
AND course2.course = 'MTH252';

Fiddle here. Sorry, the PostgreSQL fiddle wasn't working for me so I used Oracle for testing. The PostgreSQL equivalent should look roughly the same.

Now say I wanted to see a student who may not have taken MTH252. You could do this:

SELECT course1.student_id
, course1.course AS course1
, course1.grade AS grade1
, course2.course AS course2
, course2.grade AS grade2
FROM student_course course1
LEFT OUTER JOIN student_course course2
ON course1.student_id = course2.student_id
AND course2.course = 'MTH252'
WHERE course1.course = 'MTH251';

Other Fiddle

The former displays students who have taken BOTH MTH251 and MTH252, and the latter shows students who have taken MTH251, regardless of their completion of MTH252.
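
If the fiddles are unavailable, the two queries can be tried locally. This is a sketch using Python's sqlite3 module rather than Oracle or PostgreSQL; the first six rows are the table above, and student 4 (MTH251 only) is a hypothetical extra row added so that the two queries actually differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_course (student_id INTEGER, course TEXT, grade TEXT);
    INSERT INTO student_course VALUES
        (1, 'MTH251', 'A'), (1, 'MTH252', 'B'),
        (2, 'MTH251', 'A'), (2, 'MTH252', 'A'),
        (3, 'MTH251', 'B'), (3, 'MTH252', 'C'),
        (4, 'MTH251', 'C');  -- hypothetical student who never took MTH252
""")

select_list = """
    SELECT course1.student_id, course1.course, course1.grade,
           course2.course, course2.grade
    FROM student_course course1
"""

# Inner self join: only students with both courses.
both_courses = conn.execute(select_list + """
    INNER JOIN student_course course2
        ON course1.student_id = course2.student_id
    WHERE course1.course = 'MTH251'
      AND course2.course = 'MTH252'
    ORDER BY course1.student_id
""").fetchall()

# Left self join: the MTH252 test lives in ON, so student 4 is kept.
mth251_any = conn.execute(select_list + """
    LEFT OUTER JOIN student_course course2
        ON course1.student_id = course2.student_id
       AND course2.course = 'MTH252'
    WHERE course1.course = 'MTH251'
    ORDER BY course1.student_id
""").fetchall()

print(len(both_courses))  # 3
print(mth251_any[-1])     # (4, 'MTH251', 'C', None, None)
```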

As noted by Nick.McDermaid, a self join works exactly like joining two tables with different data.


