How to get matching data from another SQL table for two different columns: Inner Join and/or Union?
(The following applies when no table holds duplicate rows, i.e. every row is distinct in the SQL DISTINCT sense, and when code outside SQL similarly treats NULL as just another value.)
Every base table has a statement template, aka predicate, parameterized by column names, by which we put a row in or leave it out. We can use a (standard predicate logic) shorthand for the predicate that is like its SQL declaration.
-- facilitator [facilID] is named [facilFname] [facilLname]
facilitator(facilID, facilLname, facilFname)
-- class [classID] named [className] has prime [primeFacil] & backup [secondFacil]
class(classID, className, primeFacil, secondFacil)
Plugging a row into a predicate gives a statement aka proposition. The rows that make a true proposition go in a table and the rows that make a false proposition stay out. (So a table states the proposition of each present row and states NOT the proposition of each absent row.)
-- facilitator f1 is named Jane Doe
facilitator(f1, 'Doe', 'Jane')
-- class c1 named CSC101 has prime f1 & backup f8
class(c1, 'CSC101', f1, f8)
But every table expression value has a predicate per its expression. SQL is designed so that if tables T and U hold the (NULL-free non-duplicate) rows where T(...) and U(...) (respectively) then:
- T CROSS JOIN U holds rows where T(...) AND U(...)
- T INNER JOIN U ON condition holds rows where T(...) AND U(...) AND condition
- T LEFT JOIN U ON condition holds rows where (for U-only columns U1,...)
    T(...) AND U(...) AND condition
    OR T(...)
        AND NOT there EXISTS values for U1,... where [U(...) AND condition]
        AND U1 IS NULL AND ...
- T WHERE condition holds rows where T(...) AND condition
- T INTERSECT U holds rows where T(...) AND U(...)
- T UNION U holds rows where T(...) OR U(...)
- T EXCEPT U holds rows where T(...) AND NOT U(...)
- SELECT DISTINCT * FROM T holds rows where T(...)
- SELECT DISTINCT columns to keep FROM T holds rows where there EXISTS values for columns to drop where T(...)
- VALUES (C1, C2, ...)((v1, v2, ...), ...) holds rows where C1 = v1 AND C2 = v2 AND ... OR ...
Also:
- (...) IN T means T(...)
- scalar = T means T(scalar)
- T(..., X, ...) AND X = Y means T(..., Y, ...) AND X = Y
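A few of the correspondences above can be checked on toy data. The following is only a sketch: it uses Python's built-in sqlite3 module, and the single-column tables T and U with their integer rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (x INTEGER);  -- holds the rows where T(x)
    CREATE TABLE U (x INTEGER);  -- holds the rows where U(x)
    INSERT INTO T VALUES (1), (2), (3);
    INSERT INTO U VALUES (2), (3), (4);
""")

def rows(sql):
    """Run a query and return its rows as a sorted list of tuples."""
    return sorted(conn.execute(sql).fetchall())

# T(x) AND U(x)
both = rows("SELECT x FROM T INTERSECT SELECT x FROM U")
# T(x) OR U(x)
either = rows("SELECT x FROM T UNION SELECT x FROM U")
# T(x) AND NOT U(x)
only_t = rows("SELECT x FROM T EXCEPT SELECT x FROM U")
# T(T.x) AND U(U.x) AND T.x = U.x, keeping only T.x
joined = rows("SELECT T.x FROM T INNER JOIN U ON T.x = U.x")

print(both, either, only_t, joined)
```

With these rows the INNER JOIN on equality returns the same values as INTERSECT, which is what the AND reading of both operators suggests.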
So to query we find a way of phrasing the predicate for the rows that we want in natural language using base table predicates, then in shorthand using base table predicates, then in shorthand using aliases in column names except for output columns, then in SQL using base table names plus ON & WHERE conditions etc. If we need to mention a base table twice then we give it aliases.
-- natural language
there EXISTS values for classID, primeFacil & secondFacil where
class [classID] named [className]
has prime [primeFacil] & backup [secondFacil]
AND facilitator [primeFacil] is named [pf.facilFname] [pf.facilLname]
AND facilitator [secondFacil] is named [sf.facilFname] [sf.facilLname]
-- shorthand
there EXISTS values for classID, primeFacil & secondFacil where
class(classID, className, primeFacil, secondFacil)
AND facilitator(pf.facilID, pf.facilLname, pf.facilFname)
AND pf.facilID = primeFacil
AND facilitator(sf.facilID, sf.facilLname, sf.facilFname)
AND sf.facilID = secondFacil
-- shorthand using aliases everywhere but result
-- use # to distinguish same-named result columns in specification
there EXISTS values for c.*, pf.*, sf.* where
className = c.className
AND facilLname#1 = pf.facilLname AND facilFname#1 = pf.facilFname
AND facilLname#2 = sf.facilLname AND facilFname#2 = sf.facilFname
AND class(c.classID, c.className, c.primeFacil, c.secondFacil)
AND facilitator(pf.facilID, pf.facilLname, pf.facilFname)
AND pf.facilID = c.primeFacil
AND facilitator(sf.facilID, sf.facilLname, sf.facilFname)
AND sf.facilID = c.secondFacil
-- table names & SQL (with MS Access parentheses)
SELECT className, pf.facilLname, pf.facilFname, sf.facilLname, sf.facilFname
FROM (class JOIN facilitator AS pf ON pf.facilID = primeFacil)
JOIN facilitator AS sf ON sf.facilID = secondFacil
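To see the two-alias join run end to end, here is a minimal sketch using Python's sqlite3 module. The column types, the result column aliases, and the name given to facilitator f8 are assumptions for illustration; SQLite does not need the MS Access parentheses, so they are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE facilitator (facilID TEXT PRIMARY KEY,
                              facilLname TEXT, facilFname TEXT);
    CREATE TABLE class (classID TEXT PRIMARY KEY, className TEXT,
                        primeFacil TEXT, secondFacil TEXT);
    INSERT INTO facilitator VALUES ('f1', 'Doe', 'Jane');
    INSERT INTO facilitator VALUES ('f8', 'Roe', 'Richard');  -- hypothetical name
    INSERT INTO class VALUES ('c1', 'CSC101', 'f1', 'f8');
""")

# facilitator appears twice, once per role, so each use gets its own alias.
result = conn.execute("""
    SELECT className,
           pf.facilLname AS primeLname,  pf.facilFname AS primeFname,
           sf.facilLname AS secondLname, sf.facilFname AS secondFname
    FROM class
    JOIN facilitator AS pf ON pf.facilID = primeFacil
    JOIN facilitator AS sf ON sf.facilID = secondFacil
""").fetchall()

print(result)
```

Each alias contributes its own copy of the facilitator predicate, so the prime and backup names come from different rows of the same base table.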
OUTER JOIN would be used when a class doesn't always have both facilitators or something doesn't always have all names (i.e., when a column can be NULL). But you haven't given the specific predicates for your base tables and query, or the business rules about when things might be NULL, so I have assumed no NULLs.
Is there any rule of thumb to construct SQL query from a human-readable description?
(Re MS Access JOIN parentheses see this from SO and this from MS.)
Unioning two tables with different number of columns
Add extra columns as NULL for the table with fewer columns, like this:
Select Col1, Col2, Col3, Col4, Col5 from Table1
Union
Select Col1, Col2, Col3, Null as Col4, Null as Col5 from Table2
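As a quick check of the padding approach, here is a sketch using Python's sqlite3 module. The sample rows are invented, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Col1, Col2, Col3, Col4, Col5);
    CREATE TABLE Table2 (Col1, Col2, Col3);          -- two columns fewer
    INSERT INTO Table1 VALUES (1, 'a', 'b', 'c', 'd');
    INSERT INTO Table2 VALUES (2, 'x', 'y');
""")

cursor = conn.execute("""
    SELECT Col1, Col2, Col3, Col4, Col5 FROM Table1
    UNION
    SELECT Col1, Col2, Col3, NULL AS Col4, NULL AS Col5 FROM Table2
    ORDER BY Col1
""")
column_names = [d[0] for d in cursor.description]
union_rows = cursor.fetchall()

print(column_names)
print(union_rows)
```

Rows from Table2 come back with None (SQL NULL) in the two padded positions, and both branches of the UNION have the same number of columns.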
How can I merge the columns from two tables into one output?
Specifying the columns in your query should do the trick:
select a.col1, b.col2, a.col3, b.col4, a.category_id
from items_a a, items_b b
where a.category_id = b.category_id
That takes care of picking the columns you want.
To get around the fact that some data is only in items_a and some data is only in items_b, you would be able to do:
select
coalesce(a.col1, b.col1) as col1,
coalesce(a.col2, b.col2) as col2,
coalesce(a.col3, b.col3) as col3,
a.category_id
from items_a a, items_b b
where a.category_id = b.category_id
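The coalesce function returns its first non-null argument, so for each row, if a.col1 is non-null it uses that; otherwise it falls back to b.col1, and likewise for the other columns. Note that the join itself still returns only rows whose category_id appears in both tables.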
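A small sketch of the coalesce query on invented data, using Python's sqlite3 module. The table layouts and values are assumptions; category 2 is deliberately present only in items_a to show what the inner join does with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_a (category_id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE items_b (category_id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT);
    INSERT INTO items_a VALUES (1, 'a1', NULL, NULL);
    INSERT INTO items_b VALUES (1, 'b1', 'b2', NULL);
    INSERT INTO items_a VALUES (2, 'only in a', NULL, NULL);  -- no match in items_b
""")

merged = conn.execute("""
    SELECT COALESCE(a.col1, b.col1) AS col1,
           COALESCE(a.col2, b.col2) AS col2,
           COALESCE(a.col3, b.col3) AS col3,
           a.category_id
    FROM items_a a, items_b b
    WHERE a.category_id = b.category_id
""").fetchall()

print(merged)
```

For category 1, col1 comes from items_a, col2 falls back to items_b, and col3 stays NULL because both sides are NULL. Category 2 does not appear at all, since it has no matching row in items_b.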
get results from multiple tables using union or left join
It seems the query with UNION ALL is faster than the query with LEFT JOINs (at least for this scenario).
The LEFT JOIN query runs a full table scan three times (with nested loops), whereas the UNION ALL query needs only two table scans.
How to do a mysql select query on a table with two columns of foreign keys that relate to another table of names
You must join wp_divisions with wp_players twice:
select
d.Div_id,
p1.display_name player1,
p2.display_name player2
from wp_divisions d
inner join wp_players p1 on p1.ID = d.div_player1_id
inner join wp_players p2 on p2.ID = d.div_player2_id
If there is a case where div_player1_id or div_player2_id is null, then use left joins instead of inner joins.
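The difference between the two join types shows up as soon as one foreign key is NULL. This is a sketch using Python's sqlite3 module; the player names and division rows are invented, and the ORDER BY is added only for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wp_players (ID INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE wp_divisions (Div_id INTEGER PRIMARY KEY,
                               div_player1_id INTEGER, div_player2_id INTEGER);
    INSERT INTO wp_players VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO wp_divisions VALUES (10, 1, 2), (11, 1, NULL);  -- 11 lacks player 2
""")

# Same query text for both variants; only the join keyword changes.
query = """
    SELECT d.Div_id, p1.display_name AS player1, p2.display_name AS player2
    FROM wp_divisions d
    {join} JOIN wp_players p1 ON p1.ID = d.div_player1_id
    {join} JOIN wp_players p2 ON p2.ID = d.div_player2_id
    ORDER BY d.Div_id
"""
inner_rows = conn.execute(query.format(join="INNER")).fetchall()
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

The inner joins drop division 11 entirely, while the left joins keep it and report None for the missing second player.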
How to make two joins between two tables in MySQL such that they are interlinked to each other?
How to proceed towards the solution
Two joins can be formed between two tables by using table aliases. As the question specifies, one join is to be formed between the employee and the branch table, and another join needs to be formed between the branch and the employee table. The slightly tricky part of these types of joins is the relation specified after the ON keyword that joins the two tables.
As @philipxy writes in a comment to this question:
Constraints (including FKs & PKs) need not hold, be declared or be known in order to record or query. Joins are binary, the left table is the result of any previous joins in a series without parentheses. Except for output column order, inner & cross joins have no direction, t join u on c is u join t on c.
So according to the comment, we would form a join between employee and branch and another join between employee and an alias of the branch table called branch2. The common confusion here is that most people (including me earlier) think that there is a "direction" of joins, which is the point philipxy covers in the comment above.
The solution to the problem
You can write a SQL query that selects first_name, last_name and branch_id from the employee table and branch_name from the branch table, joining the two tables on branch_id. The mgr_id comes from the alias of the branch table called branch2, and the first_name and last_name of the branch managers come from the employee table. You can then join employee to branch2 on emp_id, so that mgr_id = emp_id.
You can finally write the SQL query for the problem like this:
SELECT employee.first_name, employee.last_name, employee.branch_id,
branch.branch_name,
branch2.mgr_id, employee.first_name AS manager_first_name, employee.last_name AS manager_last_name
FROM employee
JOIN branch ON employee.branch_id=branch.branch_id
JOIN branch branch2 ON branch2.mgr_id=employee.emp_id;
Extra information
The above mentioned query would return this:
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
| first_name | last_name | branch_id | branch_name | mgr_id | manager_first_name | manager_last_name |
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
| David | Wallace | 1 | Corporate | 100 | David | Wallace |
| Michael | Scott | 2 | Scranton | 102 | Michael | Scott |
| Josh | Porter | 3 | Stamford | 106 | Josh | Porter |
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
These results might look useless, as we have formed an INNER JOIN between the tables, so it just returns the names of the employees who are "managers" of a specific branch. If you form a LEFT JOIN between the tables instead of an INNER JOIN, you would get results like this:
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
| first_name | last_name | branch_id | branch_name | mgr_id | manager_first_name | manager_last_name |
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
| David | Wallace | 1 | Corporate | 100 | David | Wallace |
| Jan | Levinson | 1 | Corporate | NULL | Jan | Levinson |
| Michael | Scott | 2 | Scranton | 102 | Michael | Scott |
| Angela | Martin | 2 | Scranton | NULL | Angela | Martin |
| Kelly | Kapoor | 2 | Scranton | NULL | Kelly | Kapoor |
| Stanley | Hudson | 2 | Scranton | NULL | Stanley | Hudson |
| Josh | Porter | 3 | Stamford | 106 | Josh | Porter |
| Andy | Bernard | 3 | Stamford | NULL | Andy | Bernard |
| Jim | Halpert | 3 | Stamford | NULL | Jim | Halpert |
+------------+-----------+-----------+-------------+--------+--------------------+-------------------+
These results were not as expected: the employees who are not managers of any branch just have a NULL mgr_id, whereas the branch that they work in actually has a manager. With the mgr_id being NULL, the manager_first_name and manager_last_name have unexpected results too.
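The above occurs because the condition branch2.mgr_id = employee.emp_id matches an employee row only when that employee is a branch manager, and because manager_first_name and manager_last_name are read from the same employee row as first_name and last_name. Since emp_id is the PRIMARY KEY of the employee table, a given mgr_id can match at most one employee row. To report the manager of each employee's branch, the employee table itself needs a second alias joined on the branch's mgr_id.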
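One way to get each employee next to the manager of their branch is to alias the employee table rather than the branch table. This is a different query from the one above, sketched with Python's sqlite3 module; the emp_id values for the non-managers (101 and 103) and the reduced set of rows are assumptions, since the output tables above do not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, first_name TEXT,
                           last_name TEXT, branch_id INTEGER);
    CREATE TABLE branch (branch_id INTEGER PRIMARY KEY, branch_name TEXT,
                         mgr_id INTEGER);
    INSERT INTO employee VALUES
        (100, 'David', 'Wallace', 1),
        (101, 'Jan', 'Levinson', 1),     -- emp_id assumed
        (102, 'Michael', 'Scott', 2),
        (103, 'Angela', 'Martin', 2);    -- emp_id assumed
    INSERT INTO branch VALUES (1, 'Corporate', 100), (2, 'Scranton', 102);
""")

# e is the employee being listed; m is the employee who manages e's branch.
with_managers = conn.execute("""
    SELECT e.first_name, e.last_name, e.branch_id,
           b.branch_name, b.mgr_id,
           m.first_name AS manager_first_name,
           m.last_name  AS manager_last_name
    FROM employee e
    JOIN branch b   ON e.branch_id = b.branch_id
    JOIN employee m ON m.emp_id = b.mgr_id
    ORDER BY e.emp_id
""").fetchall()

for row in with_managers:
    print(row)
```

Here every employee row carries the mgr_id of its own branch, and the manager name columns come from the m alias instead of repeating the employee's own name.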
Credits
- @philipxy's comments on this question