SQL Join: Is There a Difference Between Using, on or Where

SQL JOIN: is there a difference between USING, ON or WHERE?

There is no difference in performance.

However, the first style is ANSI-89 and will get your legs broken in some shops. Including mine. The second style is ANSI-92 and is much clearer.

Examples:

Which is the JOIN, which is the filter?

FROM T1,T2,T3....
WHERE T1.ID = T2.ID AND
T1.foo = 'bar' AND T2.fish = 42 AND
T1.ID = T3.ID

FROM T1
INNER JOIN T2 ON T1.ID = T2.ID
INNER JOIN T3 ON T1.ID = T3.ID
WHERE
T1.foo = 'bar' AND T2.fish = 42

If you have OUTER JOINs (=*, *=) then the 2nd style will work as advertised. The first most likely won't and is also deprecated in SQL Server 2005+

The ANSI-92 style is harder to bollix too. With the older style you can easily end up with a Cartesian product (cross join) if you miss a condition. You'll get a syntax error with ANSI-92.

Edit: Some more clarification

  • The reason for not using "join the where" (implicit) is the dodgy results with outer joins.
  • If you use explicit OUTER JOINs + implicit INNER JOINs you'll still get dodgy results + you have inconsistency in usage

It isn't just syntax: it's about having a semantically correct query

Edit, Dec 2011

SQL Server logical query processing order is FROM, ON, JOIN, WHERE...

So if you mix "implicit WHERE inner joins" and "explicit FROM outer joins" you most likely won't get expected results because the query is ambiguous...

SQL JOIN - WHERE clause vs. ON clause

They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.

INNER JOIN ON vs WHERE clause

INNER JOIN is ANSI syntax that you should use.

It is generally considered more readable, especially when you join lots of tables.

It can also be easily replaced with an OUTER JOIN whenever a need arises.

The WHERE syntax is more relational model oriented.

A result of two tables JOINed is a cartesian product of the tables to which a filter is applied which selects only those rows with joining columns matching.

It's easier to see this with the WHERE syntax.

As for your example, in MySQL (and in SQL generally) these two queries are synonyms.

Also, note that MySQL also has a STRAIGHT_JOIN clause.

Using this clause, you can control the JOIN order: which table is scanned in the outer loop and which one is in the inner loop.

You cannot control this in MySQL using WHERE syntax.

Difference between and and where in joins

Firstly there is a semantic difference. When you have a join, you are saying that the relationship between the two tables is defined by that condition. So in your first example you are saying that the tables are related by cd.Company = table2.Name AND table2.Id IN (2728). When you use the WHERE clause, you are saying that the relationship is defined by cd.Company = table2.Name and that you only want the rows where the condition table2.Id IN (2728) applies. Even though these give the same answer, it means very different things to a programmer reading your code.

In this case, the WHERE clause is almost certainly what you mean so you should use it.

Secondly there is actually difference in the result in the case that you use a LEFT JOIN instead of an INNER JOIN. If you include the second condition as part of the join, you will still get a result row if the condition fails - you will get values from the left table and nulls for the right table. If you include the condition as part of the WHERE clause and that condition fails, you won't get the row at all.

Here is an example to demonstrate this.

Query 1 (WHERE):

SELECT DISTINCT field1
FROM table1 cd
LEFT JOIN table2
ON cd.Company = table2.Name
WHERE table2.Id IN (2728);

Result:

field1
200

Query 2 (AND):

SELECT DISTINCT field1
FROM table1 cd
LEFT JOIN table2
ON cd.Company = table2.Name
AND table2.Id IN (2728);

Result:

field1
100
200

Test data used:

CREATE TABLE table1 (Company NVARCHAR(100) NOT NULL, Field1 INT NOT NULL);
INSERT INTO table1 (Company, Field1) VALUES
('FooSoft', 100),
('BarSoft', 200);

CREATE TABLE table2 (Id INT NOT NULL, Name NVARCHAR(100) NOT NULL);
INSERT INTO table2 (Id, Name) VALUES
(2727, 'FooSoft'),
(2728, 'BarSoft');

In SQL server what is the difference between using = and join?

The difference between what you are doing (in your first example) and what your professor is doing is that you are creating a set of all possible combinations of the rows in those tables, then narrowing your results to the ones that match the way you want them to. He is creating a set of only the rows that match the way you want them to in the first place.

If your tables were:

Table1
ID1
1
2
3

Table2
ID2
1
2
3

Your query starts with basically a cross join:

Select * from Table1, Table2

ID1 ID2
1 1
2 1
3 1
1 2
2 2
3 2
1 3
2 3
3 3

Then narrows that result set down by applying the where ID1 = ID2

ID1  ID2
1 1
2 2
3 3

This is inefficient and somewhat difficult to read in more complex examples, as people have mentioned in the comments.

Your professor is building the criteria to relate the two tables into the join itself, so he is effectively skipping the first step. In our example tables, this would be Select * from Table1 join Table2 on ID1 = ID2.

There are several types of joins in SQL, which differ based on how you want to handle cases where a value exists in one of your tables, but has no match in the other table. See traditional venn diagram explanation from http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins: Traditional Venn Diagram

Difference between where and and clause in join sql query

Base on the following two test result

select a.id, a.name,a.country from table a left join table b
on a.id = b.id
where a.name is not null

is faster (237 Vs 460). As far as I know, it is a standard.

Sample Image

Sample Image

Difference between inner join and where in select join SQL statement

One difference is that the first option hides the intent by expressing the join condition in the where clause.

The second option, where the join condition is written out is more clear for the user reading the query. It shows the exact intent of the query.

As far as performance or any other difference, there shouldn't be any. Both queries should return the exact same result and perform the same under most RDBMS.

Difference between filtering queries in JOIN and WHERE?

The answer is NO difference, but:

I will always prefer to do the following.

  • Always keep the Join Conditions in ON clause
  • Always put the filter's in where clause

This makes the query more readable.

So I will use this query:

SELECT value
FROM table1
INNER JOIN table2
ON table1.id = table2.id
WHERE table1.id = 1

However when you are using OUTER JOIN'S there is a big difference in keeping the filter in the ON condition and Where condition.

Logical Query Processing

The following list contains a general form of a query, along with step numbers assigned according to the order in which the different clauses are logically processed.

(5) SELECT (5-2) DISTINCT (5-3) TOP(<top_specification>) (5-1) <select_list>
(1) FROM (1-J) <left_table> <join_type> JOIN <right_table> ON <on_predicate>
| (1-A) <left_table> <apply_type> APPLY <right_table_expression> AS <alias>
| (1-P) <left_table> PIVOT(<pivot_specification>) AS <alias>
| (1-U) <left_table> UNPIVOT(<unpivot_specification>) AS <alias>
(2) WHERE <where_predicate>
(3) GROUP BY <group_by_specification>
(4) HAVING <having_predicate>
(6) ORDER BY <order_by_list>;

Flow diagram logical query processing

Sample Image

  • (1) FROM: The FROM phase identifies the query’s source tables and
    processes table operators. Each table operator applies a series of
    sub phases. For example, the phases involved in a join are (1-J1)
    Cartesian product, (1-J2) ON Filter, (1-J3) Add Outer Rows. The FROM
    phase generates virtual table VT1.

  • (1-J1) Cartesian Product: This phase performs a Cartesian product
    (cross join) between the two tables involved in the table operator,
    generating VT1-J1.

  • (1-J2) ON Filter: This phase filters the rows from VT1-J1 based on
    the predicate that appears in the ON clause (<on_predicate>). Only
    rows for which the predicate evaluates to TRUE are inserted into
    VT1-J2.
  • (1-J3) Add Outer Rows: If OUTER JOIN is specified (as opposed to
    CROSS JOIN or INNER JOIN), rows from the preserved table or tables
    for which a match was not found are added to the rows from VT1-J2 as
    outer rows, generating VT1-J3.
  • (2) WHERE: This phase filters the rows from VT1 based on the
    predicate that appears in the WHERE clause (). Only
    rows for which the predicate evaluates to TRUE are inserted into VT2.
  • (3) GROUP BY: This phase arranges the rows from VT2 in groups based
    on the column list specified in the GROUP BY clause, generating VT3.
    Ultimately, there will be one result row per group.
  • (4) HAVING: This phase filters the groups from VT3 based on the
    predicate that appears in the HAVING clause (<having_predicate>).
    Only groups for which the predicate evaluates to TRUE are inserted
    into VT4.
  • (5) SELECT: This phase processes the elements in the SELECT clause,
    generating VT5.
  • (5-1) Evaluate Expressions: This phase evaluates the expressions in
    the SELECT list, generating VT5-1.
  • (5-2) DISTINCT: This phase removes duplicate rows from VT5-1,
    generating VT5-2.
  • (5-3) TOP: This phase filters the specified top number or percentage
    of rows from VT5-2 based on the logical ordering defined by the ORDER
    BY clause, generating the table VT5-3.
  • (6) ORDER BY: This phase sorts the rows from VT5-3 according to the
    column list specified in the ORDER BY clause, generating the cursor
    VC6.

it is referred from book "T-SQL Querying (Developer Reference)"

What's the difference between using and on in table joins in MySQL?

I don't use the USING syntax, since

  1. most of my joins aren't suited to it (not the same fieldname that is being matched, and/or multiple matches in the join) and
  2. it isn't immediately obvious what it translates to in the case with more than two tables

ie assuming 3 tables with 'id' and 'id_2' columns, does

T1 JOIN T2 USING(id) JOIN T3 USING(id_2)

become

T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T1.id_2=T3.id_2 AND T2.id_2=T3.id_2)

or

T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T2.id_2=T3.id_2)

or something else again?

Finding this out for a particular database version is a fairly trivial exercise, but I don't have a large amount of confidence that it is consistent across all databases, and I'm not the only person that has to maintain my code (so the other people will also have to be aware of what it is equivalent to).

An obvious difference with the WHERE vs ON is if the join is outer:

Assuming a T1 with a single ID field, one row containing the value 1, and a T2 with an ID and VALUE field (one row, ID=1, VALUE=6), then we get:

SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID) WHERE T2.VALUE=42

gives no rows, since the WHERE is required to match, whereas

SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID AND T2.VALUE=42)

will give one row with the values

1, NULL, NULL

since the ON is only required for matching the join, which is optional due to being outer.



Related Topics



Leave a reply



Submit