SQL JOIN: is there a difference between USING, ON or WHERE?
There is no difference in performance.
However, the first style is ANSI-89 and will get your legs broken in some shops. Including mine. The second style is ANSI-92 and is much clearer.
Examples:
Which is the JOIN, which is the filter?
FROM T1,T2,T3....
WHERE T1.ID = T2.ID AND
T1.foo = 'bar' AND T2.fish = 42 AND
T1.ID = T3.ID
FROM T1
INNER JOIN T2 ON T1.ID = T2.ID
INNER JOIN T3 ON T1.ID = T3.ID
WHERE
T1.foo = 'bar' AND T2.fish = 42
If you have OUTER JOINs (=*
, *=
) then the 2nd style will work as advertised. The first most likely won't and is also deprecated in SQL Server 2005+
The ANSI-92 style is harder to bollix too. With the older style you can easily end up with a Cartesian product (cross join) if you miss a condition. You'll get a syntax error with ANSI-92.
Edit: Some more clarification
- The reason for not using "join the where" (implicit) is the dodgy results with outer joins.
- If you use explicit OUTER JOINs + implicit INNER JOINs you'll still get dodgy results + you have inconsistency in usage
It isn't just syntax: it's about having a semantically correct query
Edit, Dec 2011
SQL Server logical query processing order is FROM, ON, JOIN, WHERE...
So if you mix "implicit WHERE inner joins" and "explicit FROM outer joins" you most likely won't get expected results because the query is ambiguous...
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345
. The second will return all orders, but only order 12345
will have any lines associated with it.
With an INNER JOIN
, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
INNER JOIN ON vs WHERE clause
INNER JOIN
is ANSI syntax that you should use.
It is generally considered more readable, especially when you join lots of tables.
It can also be easily replaced with an OUTER JOIN
whenever a need arises.
The WHERE
syntax is more relational model oriented.
A result of two tables JOIN
ed is a cartesian product of the tables to which a filter is applied which selects only those rows with joining columns matching.
It's easier to see this with the WHERE
syntax.
As for your example, in MySQL (and in SQL generally) these two queries are synonyms.
Also, note that MySQL also has a STRAIGHT_JOIN
clause.
Using this clause, you can control the JOIN
order: which table is scanned in the outer loop and which one is in the inner loop.
You cannot control this in MySQL using WHERE
syntax.
Difference between and and where in joins
Firstly there is a semantic difference. When you have a join, you are saying that the relationship between the two tables is defined by that condition. So in your first example you are saying that the tables are related by cd.Company = table2.Name AND table2.Id IN (2728)
. When you use the WHERE clause, you are saying that the relationship is defined by cd.Company = table2.Name
and that you only want the rows where the condition table2.Id IN (2728)
applies. Even though these give the same answer, it means very different things to a programmer reading your code.
In this case, the WHERE clause is almost certainly what you mean so you should use it.
Secondly there is actually difference in the result in the case that you use a LEFT JOIN instead of an INNER JOIN. If you include the second condition as part of the join, you will still get a result row if the condition fails - you will get values from the left table and nulls for the right table. If you include the condition as part of the WHERE clause and that condition fails, you won't get the row at all.
Here is an example to demonstrate this.
Query 1 (WHERE):
SELECT DISTINCT field1
FROM table1 cd
LEFT JOIN table2
ON cd.Company = table2.Name
WHERE table2.Id IN (2728);
Result:
field1
200
Query 2 (AND):
SELECT DISTINCT field1
FROM table1 cd
LEFT JOIN table2
ON cd.Company = table2.Name
AND table2.Id IN (2728);
Result:
field1
100
200
Test data used:
CREATE TABLE table1 (Company NVARCHAR(100) NOT NULL, Field1 INT NOT NULL);
INSERT INTO table1 (Company, Field1) VALUES
('FooSoft', 100),
('BarSoft', 200);
CREATE TABLE table2 (Id INT NOT NULL, Name NVARCHAR(100) NOT NULL);
INSERT INTO table2 (Id, Name) VALUES
(2727, 'FooSoft'),
(2728, 'BarSoft');
In SQL server what is the difference between using = and join?
The difference between what you are doing (in your first example) and what your professor is doing is that you are creating a set of all possible combinations of the rows in those tables, then narrowing your results to the ones that match the way you want them to. He is creating a set of only the rows that match the way you want them to in the first place.
If your tables were:
Table1
ID1
1
2
3
Table2
ID2
1
2
3
Your query starts with basically a cross join:
Select * from Table1, Table2
ID1 ID2
1 1
2 1
3 1
1 2
2 2
3 2
1 3
2 3
3 3
Then narrows that result set down by applying the where ID1 = ID2
ID1 ID2
1 1
2 2
3 3
This is inefficient and somewhat difficult to read in more complex examples, as people have mentioned in the comments.
Your professor is building the criteria to relate the two tables into the join itself, so he is effectively skipping the first step. In our example tables, this would be Select * from Table1 join Table2 on ID1 = ID2
.
There are several types of joins in SQL, which differ based on how you want to handle cases where a value exists in one of your tables, but has no match in the other table. See traditional venn diagram explanation from http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins:
Difference between where and and clause in join sql query
Base on the following two test result
select a.id, a.name,a.country from table a left join table b
on a.id = b.id
where a.name is not null
is faster (237 Vs 460). As far as I know, it is a standard.
Difference between inner join and where in select join SQL statement
One difference is that the first option hides the intent by expressing the join condition in the where clause.
The second option, where the join condition is written out is more clear for the user reading the query. It shows the exact intent of the query.
As far as performance or any other difference, there shouldn't be any. Both queries should return the exact same result and perform the same under most RDBMS.
Difference between filtering queries in JOIN and WHERE?
The answer is NO difference, but:
I will always prefer to do the following.
- Always keep the Join Conditions in
ON
clause - Always put the filter's in
where
clause
This makes the query more readable.
So I will use this query:
SELECT value
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table1.id = 1
However when you are using OUTER JOIN'S
there is a big difference in keeping the filter in the ON
condition and Where
condition.
Logical Query Processing
The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.
(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;
Flow diagram logical query processing
(1) FROM: The FROM phase identifies the query’s source tables and
processes table operators. Each table operator applies a series of
sub phases. For example, the phases involved in a join are (1-J1)
Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM
phase generates virtual table VT1.(1-J1) Cartesian Product: This phase performs a Cartesian product
(cross join) between the two tables involved in the table operator,
generating VT1-J1.- (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on
the predicate that appears in the ON clause (<on_predicate>). Only
rows for which the predicate evaluates to TRUE are inserted into
VT1-J2. - (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to
CROSS JOIN or INNER JOIN), rows from the preserved table or tables
for which a match was not found are added to the rows from VT1-J2 as
outer rows, generating VT1-J3. - (2) WHERE: This phase filters the rows from VT1 based on the
predicate that appears in the WHERE clause (). Only
rows for which the predicate evaluates to TRUE are inserted into VT2. - (3) GROUP BY: This phase arranges the rows from VT2 in groups based
on the column list specified in the GROUP BY clause, generating VT3.
Ultimately, there will be one result row per group. - (4) HAVING: This phase filters the groups from VT3 based on the
predicate that appears in the HAVING clause (<having_predicate>).
Only groups for which the predicate evaluates to TRUE are inserted
into VT4. - (5) SELECT: This phase processes the elements in the SELECT clause,
generating VT5. - (5-1) Evaluate Expressions: This phase evaluates the expressions in
the SELECT list, generating VT5-1. - (5-2) DISTINCT: This phase removes duplicate rows from VT5-1,
generating VT5-2. - (5-3) TOP: This phase filters the specified top number or percentage
of rows from VT5-2 based on the logical ordering defined by the ORDER
BY clause, generating the table VT5-3. - (6) ORDER BY: This phase sorts the rows from VT5-3 according to the
column list specified in the ORDER BY clause, generating the cursor
VC6.
it is referred from book "T-SQL Querying (Developer Reference)"
What's the difference between using and on in table joins in MySQL?
I don't use the USING syntax, since
- most of my joins aren't suited to it (not the same fieldname that is being matched, and/or multiple matches in the join) and
- it isn't immediately obvious what it translates to in the case with more than two tables
ie assuming 3 tables with 'id' and 'id_2' columns, does
T1 JOIN T2 USING(id) JOIN T3 USING(id_2)
become
T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T1.id_2=T3.id_2 AND T2.id_2=T3.id_2)
or
T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T2.id_2=T3.id_2)
or something else again?
Finding this out for a particular database version is a fairly trivial exercise, but I don't have a large amount of confidence that it is consistent across all databases, and I'm not the only person that has to maintain my code (so the other people will also have to be aware of what it is equivalent to).
An obvious difference with the WHERE vs ON is if the join is outer:
Assuming a T1 with a single ID field, one row containing the value 1, and a T2 with an ID and VALUE field (one row, ID=1, VALUE=6), then we get:
SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID) WHERE T2.VALUE=42
gives no rows, since the WHERE is required to match, whereas
SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID AND T2.VALUE=42)
will give one row with the values
1, NULL, NULL
since the ON is only required for matching the join, which is optional due to being outer.
Related Topics
How to Select Rows With Max(Column Value), Partition by Another Column in MySQL
Left Outer Join Doesn't Return All Rows from My Left Table
Two SQL Left Joins Produce Incorrect Result
Optimal Way to Concatenate/Aggregate Strings
Insert Statement Conflicted With the Foreign Key Constraint - SQL Server
How to Get Column Names from a Table in SQL Server
How to Avoid the "Divide by Zero" Error in Sql
MySQL Results as Comma Separated List
How to Return Result of a Select Inside a Function in Postgresql
How to Limit the Number of Rows Returned by an Oracle Query After Ordering
How to Select the First Row of Each Group
How to Delete Duplicate Rows in SQL Server
What Is the Major Difference Between Varchar2 and Char
Error 1452: Cannot Add or Update a Child Row: a Foreign Key Constraint Fails