Understanding how JOIN works when 3 or more tables are involved. [SQL]
Conceptually here is what happens when you join three tables together.
- The optimizer comes up with a plan, which includes a join order. It could be A, B, C, or C, B, A, or any of the other combinations.
- The query execution engine applies any predicates (WHERE clause) to the first table that don't involve any of the other tables. It selects out the columns mentioned in the JOIN conditions, the SELECT list, or the ORDER BY list. Call this result A.
- It joins this result set to the second table. For each row it joins to the second table, applying any predicates that may apply to the second table. This results in another temporary result set.
- Then it joins in the final table and applies the ORDER BY.
This is conceptually what happens. In fact, there are many possible optimizations along the way. The advantage of the relational model is that its sound mathematical basis makes various transformations of the plan possible without changing the correctness of the result.
For example, there is really no need to generate the full result sets along the way. The ORDER BY may instead be done by accessing the data through an index in the first place. There are lots of types of joins that can be done as well.
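The conceptual steps above can be sketched in plain Python as a naive nested-loop pipeline. This is only an illustration of the logical order (filter, join, join, sort), not how any real engine executes a query; the table contents, column names, and the `nested_loop_join` helper are all invented for the example.

```python
# Conceptual three-table join: filter, join, join, then sort.
# All table contents and column names are invented for illustration.
students = [
    {"student_id": 1, "name": "Ann", "active": True},
    {"student_id": 2, "name": "Bob", "active": False},
    {"student_id": 3, "name": "Cy", "active": True},
]
hall_prefs = [
    {"student_id": 1, "hall_id": 10},
    {"student_id": 3, "hall_id": 20},
    {"student_id": 3, "hall_id": 10},
]
halls = [
    {"hall_id": 10, "hall_name": "North"},
    {"hall_id": 20, "hall_name": "East"},
]


def nested_loop_join(left_rows, right_rows, condition):
    """Pair every left row with every right row that satisfies condition."""
    return [
        {**left, **right}
        for left in left_rows
        for right in right_rows
        if condition(left, right)
    ]


# Step 1: apply the predicates that involve only the first table.
result_a = [row for row in students if row["active"]]

# Step 2: join to the second table, giving a temporary result set.
result_ab = nested_loop_join(
    result_a, hall_prefs, lambda l, r: l["student_id"] == r["student_id"]
)

# Step 3: join in the final table, then apply the ORDER BY.
result_abc = nested_loop_join(
    result_ab, halls, lambda l, r: l["hall_id"] == r["hall_id"]
)
result_abc.sort(key=lambda row: (row["name"], row["hall_name"]))

rows = [(row["name"], row["hall_name"]) for row in result_abc]
print(rows)
```

A real optimizer is free to reorder these steps or skip the intermediate lists entirely, as long as the final rows are the same.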
SQL Inner-join with 3 tables?
You can do the following (I guessed at the table fields, etc.):
SELECT s.studentname
, s.studentid
, s.studentdesc
, h.hallname
FROM students s
INNER JOIN hallprefs hp
on s.studentid = hp.studentid
INNER JOIN halls h
on hp.hallid = h.hallid
Based on your request for multiple halls you could do it this way. You just join to your Halls table once for each room preference id, giving each copy its own alias:
SELECT s.StudentID
, s.FName
, s.LName
, s.Gender
, s.BirthDate
, s.Email
, r.HallPref1
, h1.hallName as Pref1HallName
, r.HallPref2
, h2.hallName as Pref2HallName
, r.HallPref3
, h3.hallName as Pref3HallName
FROM dbo.StudentSignUp AS s
INNER JOIN RoomSignUp.dbo.Incoming_Applications_Current AS r
ON s.StudentID = r.StudentID
INNER JOIN HallData.dbo.Halls AS h1
ON r.HallPref1 = h1.HallID
INNER JOIN HallData.dbo.Halls AS h2
ON r.HallPref2 = h2.HallID
INNER JOIN HallData.dbo.Halls AS h3
ON r.HallPref3 = h3.HallID
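If you want to try the "same table joined several times under different aliases" idea without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module. The table and column names are simplified stand-ins that I made up, not the real schema from the question. It also shows one thing to watch for: with INNER JOIN on every preference, a student who left a preference empty drops out of the result, while LEFT JOIN keeps them.

```python
import sqlite3

# In-memory database; table and column names are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE halls (hall_id INTEGER PRIMARY KEY, hall_name TEXT);
    CREATE TABLE applications (
        student_id INTEGER PRIMARY KEY,
        hall_pref1 INTEGER,
        hall_pref2 INTEGER
    );
    INSERT INTO halls VALUES (1, 'North'), (2, 'East'), (3, 'West');
    -- Student 101 gave no second preference.
    INSERT INTO applications VALUES (100, 2, 3), (101, 1, NULL);
""")

# The halls table is joined once per preference column, each time
# under its own alias (h1, h2).
query = """
    SELECT a.student_id, h1.hall_name, h2.hall_name
    FROM applications AS a
    {join} halls AS h1 ON a.hall_pref1 = h1.hall_id
    {join} halls AS h2 ON a.hall_pref2 = h2.hall_id
    ORDER BY a.student_id
"""

inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # student 101 is missing
print(left_rows)   # student 101 is kept, with NULL for the second hall
```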
Joining multiple tables in SQL
When joining multiple tables the output of each join logically forms a virtual table that goes into the next join.
So in the example in your question the composite result of joining the first 5 tables would be treated as the left hand table.
See Itzik Ben-Gan's Logical Query Processing Poster for more about this.
The virtual tables involved in the joins can be controlled by positioning the ON clause. For example:
SELECT *
FROM T1
INNER JOIN T2
ON T2.C = T1.C
INNER JOIN T3
LEFT JOIN T4
ON T4.C = T3.C
ON T3.C = T2.C
is equivalent to (T1 Inner Join T2) Inner Join (T3 Left Join T4)
In SQL How to choose two of three or more tables to join based on a condition?
You can join tables with conditions. Just add the condition to your Join statement. Here is an example:
SELECT
...
COALESCE(A.Price, B.Price, C.Price)
...
FROM Product P
LEFT OUTER JOIN TableA A ON A.ProductId = P.ProductId AND YourConditionA
LEFT OUTER JOIN TableB B ON B.ProductId = P.ProductId AND YourConditionB
LEFT OUTER JOIN TableC C ON C.ProductId = P.ProductId AND YourConditionC
With COALESCE you can select the first non-null value.
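Here is a runnable sketch of that pattern using Python's sqlite3 module. The tables, the `active` flag standing in for YourConditionA / YourConditionB, and the sample prices are all assumptions of mine, since the original question's schema isn't shown.

```python
import sqlite3

# Hypothetical schema: the "active" column plays the role of the
# per-table condition (YourConditionA, YourConditionB).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_a (product_id INTEGER, price REAL, active INTEGER);
    CREATE TABLE table_b (product_id INTEGER, price REAL, active INTEGER);
    INSERT INTO product VALUES (1, 'pen'), (2, 'ink'), (3, 'pad');
    INSERT INTO table_a VALUES (1, 1.5, 1), (2, 9.0, 0);
    INSERT INTO table_b VALUES (1, 2.5, 1), (2, 4.0, 1);
""")

# Each LEFT JOIN only matches when its extra condition holds, so the
# price column from a non-qualifying table is NULL and COALESCE falls
# through to the next one.
rows = conn.execute("""
    SELECT p.name, COALESCE(a.price, b.price) AS price
    FROM product AS p
    LEFT JOIN table_a AS a ON a.product_id = p.product_id AND a.active = 1
    LEFT JOIN table_b AS b ON b.product_id = p.product_id AND b.active = 1
    ORDER BY p.product_id
""").fetchall()

print(rows)
```

Putting the condition in the ON clause rather than in WHERE matters here: a WHERE filter on `a.active` would throw away the products that have no qualifying row in table_a.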
SQL Inner join more than two tables
SELECT *
FROM table1
INNER JOIN table2
ON table1.primaryKey=table2.table1Id
INNER JOIN table3
ON table1.primaryKey=table3.table1Id
SQL joining three tables, join precedence
All kinds of outer and normal joins are in the same precedence class and operators take effect left-to-right at a given nesting level of the query. You can put the join expression on the right side in parentheses to cause it to take effect first. Remember that you will have to move the ON clauses around so that they stay with their joins: the join in parentheses takes its ON clause with it into the parentheses, so it now comes textually before the other ON clause, which will be after the parentheses in the outer join statement.
(PostgreSQL example)
In
SELECT * FROM a LEFT JOIN b ON (a.id = b.id) JOIN c ON (b.ref = c.id);
the a-b join takes effect first, but we can force the b-c join to take effect first by putting it in parentheses, which looks like:
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Often you can express the same thing without extra parentheses by moving the joins around and changing the direction of the outer joins, e.g.
SELECT * FROM b JOIN c ON (b.ref = c.id) RIGHT JOIN a ON (a.id = b.id);
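To see that the grouping really changes the result, here is a small sketch using Python's sqlite3 module with made-up data. Note one substitution: instead of the parenthesized join syntax from the PostgreSQL example, the "b-c first" variant is written as a derived table (a subquery with an alias), which I'm assuming is an acceptable stand-in because it expresses the same grouping in a form that is portable across engines.

```python
import sqlite3

# Made-up data: b row 2 points at a c row that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, ref INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (1, 10), (2, 99);
    INSERT INTO c VALUES (10);
""")

# Left-to-right: (a LEFT JOIN b) JOIN c. The final inner join removes
# the a row whose b.ref has no match in c.
left_to_right = conn.execute("""
    SELECT a.id, b.ref, c.id
    FROM a
    LEFT JOIN b ON a.id = b.id
    JOIN c ON b.ref = c.id
    ORDER BY a.id
""").fetchall()

# b-c first: a LEFT JOIN (b JOIN c), written here as a derived table.
# Every a row survives; unmatched ones get NULLs.
b_c_first = conn.execute("""
    SELECT a.id, bc.ref, bc.c_id
    FROM a
    LEFT JOIN (
        SELECT b.id AS b_id, b.ref AS ref, c.id AS c_id
        FROM b
        JOIN c ON b.ref = c.id
    ) AS bc ON a.id = bc.b_id
    ORDER BY a.id
""").fetchall()

print(left_to_right)
print(b_c_first)
```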
SQL Join: Are selects between more than 2 tables still joins?
Inner joins and outer joins are perfectly reasonable to use with more than 2 tables.
An inner join returns only the rows that have a match on the join condition in both tables, whereas an outer join also keeps the rows of the preserved table (left, right, or both) that have no match.
Let us say you wanted to join 4 tables together...
select * from testtable
inner join testtable2 on col1 = othercolumn
inner join testtable3 on col2 = othercolumn
left join testtable4 on col3 = othercolumn
In this case, it would return only rows that have a match in both inner joins, but a match would not have to exist for the left join. You are forcing testtables 2 and 3 to have a value on what you are joining on; it cannot be null.
The left join does not care if there is no matching value, and will show the row anyway, with NULLs in the testtable4 columns.
I hope this helps some. Basically, if you inner join on a value and it can possibly be null or unmatched, those rows drop out, and if no row matches at all the query returns nothing. This is the scenario where you would use an outer join: you are not forcing the value to exist.
SQL JOIN and different types of JOINs
What is SQL JOIN?
SQL JOIN is a method to retrieve data from two or more database tables.
What are the different SQL JOINs?
There are a total of five JOINs. They are:
1. JOIN or INNER JOIN
2. OUTER JOIN
2.1 LEFT OUTER JOIN or LEFT JOIN
2.2 RIGHT OUTER JOIN or RIGHT JOIN
2.3 FULL OUTER JOIN or FULL JOIN
3. NATURAL JOIN
4. CROSS JOIN
5. SELF JOIN
1. JOIN or INNER JOIN :
In this kind of JOIN, we get all records that match the condition in both tables, and records in both tables that do not match are not reported.
In other words, INNER JOIN is based on the single fact that: ONLY the matching entries in BOTH the tables SHOULD be listed.
Note that a JOIN without any other JOIN keywords (like INNER, OUTER, LEFT, etc) is an INNER JOIN. In other words, JOIN is syntactic sugar for INNER JOIN (see: Difference between JOIN and INNER JOIN).
2. OUTER JOIN :
OUTER JOIN retrieves
Either,
the matched rows from one table and all rows in the other table
Or,
all rows in all tables (it doesn't matter whether or not there is a match).
There are three kinds of Outer Join :
2.1 LEFT OUTER JOIN or LEFT JOIN
This JOIN returns all the rows from the left table in conjunction with the matching rows from the right table. If there are no rows matching in the right table, it returns NULL values for the right table's columns.
2.2 RIGHT OUTER JOIN or RIGHT JOIN
This JOIN returns all the rows from the right table in conjunction with the matching rows from the left table. If there are no rows matching in the left table, it returns NULL values for the left table's columns.
2.3 FULL OUTER JOIN or FULL JOIN
This JOIN combines LEFT OUTER JOIN and RIGHT OUTER JOIN. It returns all rows from both tables, pairing them where the condition is met and filling in NULL values where there is no match.
In other words, OUTER JOIN is based on the fact that: the matching entries AND the unmatched entries from ONE OF the tables (RIGHT or LEFT) or BOTH of the tables (FULL) SHOULD be listed.
Note that `OUTER JOIN` is a loosened form of `INNER JOIN`.
3. NATURAL JOIN :
It is based on two conditions :
- the JOIN is made on all the columns with the same name for equality.
- duplicate columns are removed from the result.
This seems to be more theoretical in nature, and as a result (probably) some DBMSs (SQL Server, for example) don't even bother supporting it.
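SQLite is one engine that does accept NATURAL JOIN, so the two conditions can be checked with Python's sqlite3 module. The `emp` and `dept` tables below are invented for the illustration; the only column name they share is `dept_id`.

```python
import sqlite3

# Hypothetical tables sharing exactly one column name: dept_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (dept_id INTEGER, emp_name TEXT);
    CREATE TABLE dept (dept_id INTEGER, dept_name TEXT);
    INSERT INTO emp VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO dept VALUES (1, 'Sales'), (3, 'Ops');
""")

# No ON clause: the join condition is implied by the shared column name.
cursor = conn.execute("SELECT * FROM emp NATURAL JOIN dept")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)  # dept_id appears once, not twice
print(rows)     # only the row where dept_id matches in both tables
```

The implicit condition is also the usual argument against NATURAL JOIN in production code: adding a column with a coincidentally matching name to either table silently changes the join.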
4. CROSS JOIN :
It is the Cartesian product of the two tables involved: every row of one table is paired with every row of the other. The result of a CROSS JOIN will not make sense in most situations. Moreover, we will rarely need it.
5. SELF JOIN :
It is not a different form of JOIN; rather, it is a JOIN (INNER, OUTER, etc) of a table to itself.
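A common use of a self join is a table whose rows point at other rows of the same table. Here is a minimal sketch with Python's sqlite3 module; the `staff` table and its `manager_id` column are hypothetical, chosen only to show the aliasing.

```python
import sqlite3

# Hypothetical table: manager_id refers to another row's id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO staff VALUES (1, 'Dana', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")

# The same table appears twice under two aliases: e for the employee,
# m for the manager. LEFT JOIN keeps Dana, who has no manager.
rows = conn.execute("""
    SELECT e.name, m.name
    FROM staff AS e
    LEFT JOIN staff AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(rows)
```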
JOINs based on Operators
Depending on the operator used for a JOIN clause, there can be two types of JOINs. They are
- Equi JOIN
- Theta JOIN
1. Equi JOIN :
For whatever JOIN type (INNER, OUTER, etc), if we use ONLY the equality operator (=), then we say that the JOIN is an EQUI JOIN.
2. Theta JOIN :
This is the same as EQUI JOIN but it allows all other operators like >, <, >= etc.
Many consider both EQUI JOIN and Theta JOIN similar to INNER, OUTER etc JOINs. But I strongly believe that it's a mistake and makes the ideas vague, because INNER JOIN, OUTER JOIN etc are all connected with the tables and their data, whereas EQUI JOIN and THETA JOIN are only connected with the operators we use in the former.
Again, there are many who consider NATURAL JOIN as some sort of "peculiar" EQUI JOIN. In fact, it is true, because of the first condition I mentioned for NATURAL JOIN. However, we don't have to restrict that simply to NATURAL JOINs alone. INNER JOINs, OUTER JOINs etc could be an EQUI JOIN too.
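To illustrate that the operator is a separate choice from the join type, here is a sketch of a theta join (range comparison) written as a LEFT OUTER JOIN, using Python's sqlite3 module. The `orders` and `bands` tables and their values are made up for the example.

```python
import sqlite3

# Hypothetical tables: each order should fall into at most one band.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount INTEGER);
    CREATE TABLE bands (band TEXT, low INTEGER, high INTEGER);
    INSERT INTO orders VALUES (1, 5), (2, 50), (3, 500);
    INSERT INTO bands VALUES ('small', 0, 9), ('medium', 10, 99);
""")

# Theta join: the ON clause uses >= and <= instead of =.
# It is still an outer join, so order 3 (no matching band) is kept.
rows = conn.execute("""
    SELECT o.order_id, b.band
    FROM orders AS o
    LEFT JOIN bands AS b ON o.amount >= b.low AND o.amount <= b.high
    ORDER BY o.order_id
""").fetchall()

print(rows)
```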
Three table join with joins other than INNER JOIN
Yes, I do use all three of those JOINs, although I tend to stick to using just LEFT (OUTER) JOINs instead of inter-mixing LEFT and RIGHT JOINs. I also use FULL OUTER JOINs and CROSS JOINs.
In summary, an INNER JOIN restricts the resultset only to those records satisfied by the JOIN condition. Consider the following tables
EDIT: I've renamed the tables and prefixed them with @ so that Table Variables can be used by anyone reading this answer and wanting to experiment.
If you'd also like to experiment with this in the browser, I've set this all up on SQL Fiddle too;
@Table1
id | name
---------
1 | One
2 | Two
3 | Three
4 | Four
@Table2
id | name
---------
1 | Partridge
2 | Turtle Doves
3 | French Hens
5 | Gold Rings
SQL code
DECLARE @Table1 TABLE (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25))
INSERT INTO @Table1 VALUES(1, 'One');
INSERT INTO @Table1 VALUES(2, 'Two');
INSERT INTO @Table1 VALUES(3, 'Three');
INSERT INTO @Table1 VALUES(4, 'Four');
DECLARE @Table2 TABLE (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25))
INSERT INTO @Table2 VALUES(1, 'Partridge');
INSERT INTO @Table2 VALUES(2, 'Turtle Doves');
INSERT INTO @Table2 VALUES(3, 'French Hens');
INSERT INTO @Table2 VALUES(5, 'Gold Rings');
An INNER JOIN SQL Statement, joined on the id field
SELECT
t1.id,
t1.name,
t2.name
FROM
@Table1 t1
INNER JOIN
@Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
A LEFT JOIN will return a resultset with all records from the table on the left hand side of the join (if you were to write out the statement as a one liner, the table that appears first) and fields from the table on the right side of the join that match the join expression and are included in the SELECT clause. Missing details will be populated with NULL.
SELECT
t1.id,
t1.name,
t2.name
FROM
@Table1 t1
LEFT JOIN
@Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
4 | Four | NULL
A RIGHT JOIN is the same logic as a LEFT JOIN but will return all records from the right-hand side of the join and fields from the left side that match the join expression and are included in the SELECT clause.
SELECT
t1.id,
t1.name,
t2.name
FROM
@Table1 t1
RIGHT JOIN
@Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
NULL| NULL| Gold Rings
Of course, there is also the FULL OUTER JOIN, which includes records from both joined tables and populates any missing details with NULL.
SELECT
t1.id,
t1.name,
t2.name
FROM
@Table1 t1
FULL OUTER JOIN
@Table2 t2
ON
t1.id = t2.id
Results in
id | name | name
----------------
1 | One | Partridge
2 | Two | Turtle Doves
3 | Three| French Hens
4 | Four | NULL
NULL| NULL| Gold Rings
And a CROSS JOIN (also known as a CARTESIAN PRODUCT), which simply pairs every row from one table with every row from the other table. Notice that there is no join expression in a CROSS JOIN.
SELECT
t1.id,
t1.name,
t2.name
FROM
@Table1 t1
CROSS JOIN
@Table2 t2
Results in
id | name | name
------------------
1 | One | Partridge
2 | Two | Partridge
3 | Three | Partridge
4 | Four | Partridge
1 | One | Turtle Doves
2 | Two | Turtle Doves
3 | Three | Turtle Doves
4 | Four | Turtle Doves
1 | One | French Hens
2 | Two | French Hens
3 | Three | French Hens
4 | Four | French Hens
1 | One | Gold Rings
2 | Two | Gold Rings
3 | Three | Gold Rings
4 | Four | Gold Rings
EDIT:
Imagine there is now a Table3
@Table3
id | name
---------
2 | Prime 1
3 | Prime 2
5 | Prime 3
The SQL code
DECLARE @Table3 TABLE (id INT PRIMARY KEY CLUSTERED, [name] VARCHAR(25))
INSERT INTO @Table3 VALUES(2, 'Prime 1');
INSERT INTO @Table3 VALUES(3, 'Prime 2');
INSERT INTO @Table3 VALUES(5, 'Prime 3');
Now all three tables joined with INNER JOINS
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
@Table1 t1
INNER JOIN
@Table2 t2
ON
t1.id = t2.id
INNER JOIN
@Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
It might help to understand this result by thinking that records with id 2 and 3 are the only ones common to all 3 tables and are also the field we are joining each table on.
Now all three with LEFT JOINS
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
@Table1 t1
LEFT JOIN
@Table2 t2
ON
t1.id = t2.id
LEFT JOIN
@Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
1 | One | Partridge | NULL
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
4 | Four | NULL | NULL
Joel's answer is a good explanation for explaining this resultset (Table1 is the base/origin table).
Now with an INNER JOIN and a LEFT JOIN
SELECT
t1.id,
t1.name,
t2.name,
t3.name
FROM
@Table1 t1
INNER JOIN
@Table2 t2
ON
t1.id = t2.id
LEFT JOIN
@Table3 t3
ON
t1.id = t3.id
Results in
id | name | name | name
-------------------------------
1 | One | Partridge | NULL
2 | Two | Turtle Doves | Prime 1
3 | Three| French Hens | Prime 2
Although we do not know the order in which the query optimiser will perform the operations, we will look at this query from top to bottom to understand the resultset. The INNER JOIN on ids between Table1 and Table2 will restrict the resultset to only those records satisfied by the join condition, i.e. the three rows that we saw in the very first example. This temporary resultset will then be LEFT JOINed to Table3 on ids between Table1 and Table3. There are records in Table3 with id 2 and 3, but not id 1, so the t3.name field will have details for 2 and 3 but not 1.
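If you don't have SQL Server to hand, the three-table examples can also be reproduced with Python's built-in sqlite3 module. This is a sketch under a couple of assumptions: ordinary tables named table1, table2 and table3 stand in for the table variables, and an ORDER BY is added so the row order is predictable. The `three_way` helper is just a convenience I've added for swapping the join types.

```python
import sqlite3

# Same data as @Table1, @Table2 and @Table3, as ordinary SQLite tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table3 (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO table1 VALUES (1, 'One'), (2, 'Two'), (3, 'Three'), (4, 'Four');
    INSERT INTO table2 VALUES (1, 'Partridge'), (2, 'Turtle Doves'),
                              (3, 'French Hens'), (5, 'Gold Rings');
    INSERT INTO table3 VALUES (2, 'Prime 1'), (3, 'Prime 2'), (5, 'Prime 3');
""")


def three_way(first_join, second_join):
    """Join table1 to table2 and table3 using the given join keywords."""
    sql = f"""
        SELECT t1.id, t1.name, t2.name, t3.name
        FROM table1 AS t1
        {first_join} table2 AS t2 ON t1.id = t2.id
        {second_join} table3 AS t3 ON t1.id = t3.id
        ORDER BY t1.id
    """
    return conn.execute(sql).fetchall()


inner_inner = three_way("INNER JOIN", "INNER JOIN")
left_left = three_way("LEFT JOIN", "LEFT JOIN")
inner_left = three_way("INNER JOIN", "LEFT JOIN")

for label, result in [("inner/inner", inner_inner),
                      ("left/left", left_left),
                      ("inner/left", inner_left)]:
    print(label, result)
```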