Difference Between ON Clause and USING Clause in SQL

What's the difference between USING and ON in table joins in MySQL?

I don't use the USING syntax, since

  1. most of my joins aren't suited to it (not the same fieldname that is being matched, and/or multiple matches in the join) and
  2. it isn't immediately obvious what it translates to in the case with more than two tables

i.e., assuming 3 tables, each with 'id' and 'id_2' columns, does

T1 JOIN T2 USING(id) JOIN T3 USING(id_2)

become

T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T1.id_2=T3.id_2 AND T2.id_2=T3.id_2)

or

T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T2.id_2=T3.id_2)

or something else again?

Finding this out for a particular database version is a fairly trivial exercise, but I don't have a large amount of confidence that it is consistent across all databases, and I'm not the only person that has to maintain my code (so the other people will also have to be aware of what it is equivalent to).

An obvious difference between WHERE and ON shows up when the join is an outer join:

Assuming a T1 with a single ID field, one row containing the value 1, and a T2 with an ID and VALUE field (one row, ID=1, VALUE=6), then we get:

SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID) WHERE T2.VALUE=42

gives no rows, since the WHERE is required to match, whereas

SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID AND T2.VALUE=42)

will give one row with the values

1, NULL, NULL

since the ON is only required for matching the join, which is optional due to being outer.
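The T1/T2 example above can be reproduced with a short script. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the choice of SQLite is an assumption made for convenience, and other engines should be checked separately.

```python
import sqlite3

# In-memory SQLite database holding the two tables described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER);
    CREATE TABLE T2 (ID INTEGER, VALUE INTEGER);
    INSERT INTO T1 VALUES (1);
    INSERT INTO T2 VALUES (1, 6);
""")

# Condition in WHERE: it filters the joined result, so the single row is dropped.
where_rows = conn.execute("""
    SELECT T1.ID, T2.ID, T2.VALUE
    FROM T1 LEFT OUTER JOIN T2 ON (T1.ID = T2.ID)
    WHERE T2.VALUE = 42
""").fetchall()

# Condition in ON: it only decides whether a T2 row matches; the T1 row survives.
on_rows = conn.execute("""
    SELECT T1.ID, T2.ID, T2.VALUE
    FROM T1 LEFT OUTER JOIN T2 ON (T1.ID = T2.ID AND T2.VALUE = 42)
""").fetchall()

print(where_rows)  # []
print(on_rows)     # [(1, None, None)]
```

Python shows SQL NULL as None, so the second result corresponds to the row 1, NULL, NULL.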

SQL JOIN - WHERE clause vs. ON clause

They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345.

The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
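To see the two Orders queries side by side, here is a hedged sketch using Python's sqlite3 module. The Item column and the sample rows are made up for illustration, and an ORDER BY has been added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER);
    CREATE TABLE OrderLines (OrderID INTEGER, Item TEXT);  -- Item is a hypothetical column
    INSERT INTO Orders VALUES (12345), (12346);
    INSERT INTO OrderLines VALUES (12345, 'pen'), (12345, 'ink'), (12346, 'paper');
""")

# Filter in WHERE: only order 12345 and its lines come back.
where_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
    WHERE Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.Item
""").fetchall()

# Filter in ON: every order comes back, but only 12345 gets lines attached.
on_rows = conn.execute("""
    SELECT Orders.ID, OrderLines.Item
    FROM Orders
    LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
                        AND Orders.ID = 12345
    ORDER BY Orders.ID, OrderLines.Item
""").fetchall()

print(where_rows)  # [(12345, 'ink'), (12345, 'pen')]
print(on_rows)     # [(12345, 'ink'), (12345, 'pen'), (12346, None)]
```

Order 12346 has a line in the table, yet it shows up with NULL in the second query because the ON condition rejected the match.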

USING Keyword vs ON clause - MYSQL

With the USING clause we don't have to spell out the full join condition when retrieving data from multiple tables. The column named in the USING clause must be present in both tables, and the SELECT query joins the tables on equality of that column.

For example, if the two tables have two column names in common, name the one you want to join on in the USING clause.

USING is also used while executing dynamic SQL (here in Oracle PL/SQL, where it supplies bind variables rather than join columns), like so:

EXECUTE IMMEDIATE 'DELETE FROM dept WHERE deptno = :num'
USING dept_id;

  • The USING clause: This allows you to specify the join key by name.

  • The ON clause: This syntax allows you to specify the column names for join keys in both tables.

The USING clause

The USING clause is used if several columns share the same name but you don’t want to join using all of these common columns. The columns listed in the USING clause can’t have any qualifiers in the statement, including the WHERE clause.
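As an illustration of joining on only one of several shared columns, here is a small sketch using Python's sqlite3 module. The emp and dept tables, their columns, and the rows are all hypothetical; NATURAL JOIN is shown only as the contrast case that joins on every common column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables sharing two column names: dept_id and location_id.
    CREATE TABLE emp  (dept_id INTEGER, location_id INTEGER, name TEXT);
    CREATE TABLE dept (dept_id INTEGER, location_id INTEGER, dept_name TEXT);
    INSERT INTO emp  VALUES (10, 1, 'Ann'), (20, 2, 'Bob');
    INSERT INTO dept VALUES (10, 1, 'Sales'), (20, 9, 'Ops');
""")

# NATURAL JOIN matches on both shared columns, so Bob (location 2 vs 9) drops out.
natural_rows = conn.execute(
    "SELECT name, dept_name FROM emp NATURAL JOIN dept ORDER BY name"
).fetchall()

# USING (dept_id) matches on dept_id only, so both employees are kept.
using_rows = conn.execute(
    "SELECT name, dept_name FROM emp JOIN dept USING (dept_id) ORDER BY name"
).fetchall()

print(natural_rows)  # [('Ann', 'Sales')]
print(using_rows)    # [('Ann', 'Sales'), ('Bob', 'Ops')]
```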

The ON clause

The ON clause is used to join tables where the column names don’t match in both tables. The join conditions are kept separate from the filter conditions in the WHERE clause.

What's the difference between where clause and on clause when table left join?

The where clause applies to the whole resultset; the on clause only applies to the join in question.

In the example supplied, all of the additional conditions related to fields on the inner side of the join - so in this example, the two queries are effectively identical.

However, if you had included a condition on a value in the table in the outer side of the join, it would have made a significant difference.

You can get more from this link: http://ask.sqlservercentral.com/questions/80067/sql-data-filter-condition-in-join-vs-where-clause

For example:

select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 and t2.f4=1

select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 where t2.f4=1

- do different things - the former will left join to t2 records where f4 is 1, while the latter has effectively been turned back into an inner join to t2.

Difference between USING and ON in Oracle SQL

The difference for me is that you can paint yourself into a corner with the USING clause:

CREATE TABLE roster (mgrid INTEGER, empid INTEGER);
CREATE TABLE emp (empid INTEGER, NAME VARCHAR2(20));

INSERT INTO roster VALUES (1,10);
INSERT INTO roster VALUES (1,11);
INSERT INTO roster VALUES (1,12);
INSERT INTO roster VALUES (2,20);
INSERT INTO roster VALUES (2,21);

INSERT INTO emp VALUES (10, 'John');
INSERT INTO emp VALUES (11, 'Steve');
INSERT INTO emp VALUES (12, 'Mary');
INSERT INTO emp VALUES (20, 'Ann');
INSERT INTO emp VALUES (21, 'George');
INSERT INTO emp VALUES (1, 'Pete');
INSERT INTO emp VALUES (2, 'Sally');

SELECT r.mgrid, e2.name, e1.empid, e1.name
FROM roster r JOIN emp e1 USING(empid)
JOIN emp e2 ON r.mgrid = e2.empid;

In the above select, you get an ORA-25154, "column part of USING clause cannot have a qualifier".

If you remove the e1.empid qualifier, as in:

SELECT r.mgrid, e2.name, empid, e1.name
FROM roster r JOIN emp e1 USING(empid)
JOIN emp e2 ON r.mgrid = e2.empid;

You get an ORA-00918 error, "column ambiguously defined".

You have to use:

SELECT r.mgrid, e2.name, e1.empid, e1.name
FROM roster r JOIN emp e1 ON r.empid = e1.empid
JOIN emp e2 ON r.mgrid = e2.empid;

The example is contrived, but when I was first exploring the join syntax I ran into this exact problem in a real situation. I have avoided the USING clause ever since. There is no advantage with the USING clause other than a few keystrokes.

MySQL JOIN ON vs USING?

It is mostly syntactic sugar, but a couple differences are noteworthy:

ON is the more general of the two. One can join tables ON a column, a set of columns and even a condition. For example:

SELECT * FROM world.City JOIN world.Country ON (City.CountryCode = Country.Code) WHERE ...

USING is useful when both tables share a column of the exact same name on which they join. In this case, one may say:

SELECT ... FROM film JOIN film_actor USING (film_id) WHERE ...

An additional nice treat is that one does not need to fully qualify the joining columns:

SELECT film.title, film_id -- film_id is not prefixed
FROM film
JOIN film_actor USING (film_id)
WHERE ...

To illustrate, to do the above with ON, we would have to write:

SELECT film.title, film.film_id -- film.film_id is required here
FROM film
JOIN film_actor ON (film.film_id = film_actor.film_id)
WHERE ...

Notice the film.film_id qualification in the SELECT clause. It would be invalid to just say film_id since that would make for an ambiguity:

ERROR 1052 (23000): Column 'film_id' in field list is ambiguous
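The same qualification rule can be checked outside MySQL. The sketch below uses Python's sqlite3 module with cut-down film and film_actor tables (the columns other than film_id and title, and the sample rows, are assumptions); SQLite reports the ambiguity with its own error text rather than MySQL's ERROR 1052.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down stand-ins for the film tables; the data is made up.
    CREATE TABLE film (film_id INTEGER, title TEXT);
    CREATE TABLE film_actor (film_id INTEGER, actor_id INTEGER);
    INSERT INTO film VALUES (1, 'Example Film');
    INSERT INTO film_actor VALUES (1, 7);
""")

# With USING, the unqualified film_id is accepted.
using_rows = conn.execute(
    "SELECT film.title, film_id FROM film JOIN film_actor USING (film_id)"
).fetchall()

# With ON, the unqualified film_id could mean either table's column.
try:
    conn.execute(
        "SELECT film.title, film_id "
        "FROM film JOIN film_actor ON (film.film_id = film_actor.film_id)"
    )
    on_error = None
except sqlite3.OperationalError as exc:
    on_error = str(exc)

print(using_rows)  # [('Example Film', 1)]
print(on_error)    # SQLite's "ambiguous column name" message
```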

As for select *, the joining column appears in the result set twice with ON while it appears only once with USING:

mysql> create table t(i int);insert t select 1;create table t2 select*from t;
Query OK, 0 rows affected (0.11 sec)

Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0

Query OK, 1 row affected (0.19 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql> select*from t join t2 on t.i=t2.i;
+------+------+
| i | i |
+------+------+
| 1 | 1 |
+------+------+
1 row in set (0.00 sec)

mysql> select*from t join t2 using(i);
+------+
| i |
+------+
| 1 |
+------+
1 row in set (0.00 sec)

mysql>
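If you want to check the column count programmatically rather than by eye, a sketch along these lines works with Python's sqlite3 module. It assumes SQLite treats SELECT * with USING the same way as the MySQL session above; the column_names helper is introduced here purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (i INTEGER);
    INSERT INTO t VALUES (1);
    CREATE TABLE t2 AS SELECT * FROM t;
""")

def column_names(sql):
    """Return the result-set column names of a query (illustrative helper)."""
    cursor = conn.execute(sql)
    return [col[0] for col in cursor.description]

on_cols = column_names("SELECT * FROM t JOIN t2 ON t.i = t2.i")
using_cols = column_names("SELECT * FROM t JOIN t2 USING (i)")

print(on_cols)     # ['i', 'i']
print(using_cols)  # ['i']
```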

In SQL / MySQL, what is the difference between ON and WHERE in a join statement?

WHERE is a part of the SELECT query as a whole, ON is a part of each individual join.

ON can only refer to the fields of previously used tables.

When a record in the left table has no actual match in the right table, LEFT JOIN still returns that left record once, with all fields of the right table set to NULL. The WHERE clause is then evaluated and filters this result.

In your query, only the records from gifts without a match in 'sentgifts' are returned.

Here's the example

gifts

1 Teddy bear
2 Flowers

sentgifts

1 Alice
1 Bob

---
SELECT *
FROM gifts g
LEFT JOIN
sentgifts sg
ON g.giftID = sg.giftID

---

1 Teddy bear 1 Alice
1 Teddy bear 1 Bob
2 Flowers NULL NULL -- no match in sentgifts

---
SELECT *
FROM gifts g
LEFT JOIN
sentgifts sg
ON g.giftID = sg.giftID
WHERE sg.giftID IS NULL

---

2 Flowers NULL NULL -- no match in sentgifts

As you can see, a gift with no actual match ends up with NULL in sg.giftID, so only the gifts that have never been sent are returned.
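The gifts example can be run as follows. This is a minimal sketch with Python's sqlite3 module; the column names giftName and sentTo are assumptions, since the listing above only shows the values. A NOT EXISTS query is included as an equivalent phrasing of the same question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- giftName and sentTo are assumed column names.
    CREATE TABLE gifts (giftID INTEGER, giftName TEXT);
    CREATE TABLE sentgifts (giftID INTEGER, sentTo TEXT);
    INSERT INTO gifts VALUES (1, 'Teddy bear'), (2, 'Flowers');
    INSERT INTO sentgifts VALUES (1, 'Alice'), (1, 'Bob');
""")

# LEFT JOIN, then keep only the rows where the right side found no match.
unsent_join = conn.execute("""
    SELECT g.giftID, g.giftName
    FROM gifts g
    LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
    WHERE sg.giftID IS NULL
""").fetchall()

# The same question phrased with NOT EXISTS.
unsent_exists = conn.execute("""
    SELECT g.giftID, g.giftName
    FROM gifts g
    WHERE NOT EXISTS (SELECT 1 FROM sentgifts sg WHERE sg.giftID = g.giftID)
""").fetchall()

print(unsent_join)    # [(2, 'Flowers')]
print(unsent_exists)  # [(2, 'Flowers')]
```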

Is there any difference between a where clause and using an association object, for performance?

They do the same thing. If you run the query in the console, you will see something like this

Order Load (0.3ms) SELECT "orders".* from "orders" WHERE "orders"."id" = $1 LIMIT $2, [["ID", 1], "LIMIT", 1]
Product Load (0.3ms) SELECT "products".* from "products" WHERE "products"."order_id" = $1 [["ORDER_ID", 1]]

Inner Join vs Natural Join vs USING clause: are there any advantages?

Now, apart from the fact that the first form has a duplicated column, is there a real advantage to the other two forms? Or are they just syntactic sugar?

TL;DR NATURAL JOIN is used in a certain style of relational programming that is simpler than the usual SQL style. (Although when embedded in SQL it is burdened with the rest of SQL query syntax.) That's because 1. it directly uses the simple operators of predicate logic, the language of precision in engineering (including software engineering), science (including computer science) and mathematics, and moreover 2. simultaneously and alternatively it directly uses the simple operators of relational algebra.

The common complaint about NATURAL JOIN is that since shared columns aren't explicit, after a schema change inappropriate column pairing may occur. And that may be the case in a particular development environment. But in that case there was a requirement that only certain columns be joined and NATURAL JOIN without PROJECT was not appropriate. So these arguments assume that NATURAL JOIN is being used inappropriately. Moreover the arguers aren't even aware that they are ignoring requirements. Such complaints are specious. (Moreover, sound software engineering design principles lead to not having interfaces with such specifications.)

Another related misconceived specious complaint from the same camp is that "NATURAL JOIN does not even take foreign key relationships into account". But any join is there because of the table meanings, not the constraints. Constraints are not needed to query. If a constraint is added then a query remains correct. If a constraint is dropped then a query relying on it becomes wrong and must be changed to a phrasing that doesn't rely on it, one that wouldn't have had to change in the first place. This has nothing to do with NATURAL JOIN.


You have described the difference in effect: just one copy of each common column is returned.

From Is there any rule of thumb to construct SQL query from a human-readable description?:

It turns out that natural language expressions and logical expressions and relational algebra expressions and SQL expressions (a hybrid of the last two) correspond in a rather direct way.

Eg from Codd 1970:

The relation depicted is called component. [...] The meaning of component(x, y,z) is that part x is an immediate component (or subassembly) of part y, and z units of part x are needed to assemble one unit of part y.

From this answer:

Every base table has a statement template, aka predicate, parameterized by column names, by which we put a row in or leave it out.

Plugging a row into a predicate gives a statement aka proposition. The rows that make a true proposition go in a table and the rows that make a false proposition stay out. (So a table states the proposition of each present row and states NOT the proposition of each absent row.)

But every table expression value has a predicate per its expression. The relational model is designed so that if tables T and U hold rows where T(...) and U(...) (respectively) then:

  • T NATURAL JOIN U holds rows where T(...) AND U(...)
  • T WHERE condition holds rows where T(...) AND condition
  • T UNION CORRESPONDING U holds rows where T(...) OR U(...)
  • T EXCEPT CORRESPONDING U holds rows where T(...) AND NOT U(...)
  • SELECT DISTINCT columns to keep FROM T holds rows where THERE EXISTS columns to drop SUCH THAT T(...)
  • etc
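The correspondence in the list above can be tried out on a toy case. The sketch below uses Python's sqlite3 module with two single-column tables T and U whose contents are made up. SQLite has no CORRESPONDING keyword, so plain UNION and EXCEPT stand in for it here, which is equivalent only because both tables have the same single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (x INTEGER);
    CREATE TABLE U (x INTEGER);
    INSERT INTO T VALUES (1), (2), (3);
    INSERT INTO U VALUES (2), (3), (4);
""")

def values(sql):
    """Collect the first column of a query result into a set (illustrative helper)."""
    return {row[0] for row in conn.execute(sql)}

t_and_u = values("SELECT x FROM T NATURAL JOIN U")             # T(x) AND U(x)
t_and_cond = values("SELECT x FROM T WHERE x > 1")             # T(x) AND x > 1
t_or_u = values("SELECT x FROM T UNION SELECT x FROM U")       # T(x) OR U(x)
t_and_not_u = values("SELECT x FROM T EXCEPT SELECT x FROM U") # T(x) AND NOT U(x)

print(t_and_u)      # {2, 3}
print(t_and_cond)   # {2, 3}
print(t_or_u)       # {1, 2, 3, 4}
print(t_and_not_u)  # {1}
```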

Whereas reasoning about SQL otherwise is... not "natural":

An SQL SELECT statement can be thought of algebraically as 1. implicitly RENAMEing each column C of a table with (possibly implicit) correlation name T to T.C, then 2. CROSS JOINing, then 3. RESTRICTing per INNER ON, then 4. RESTRICTing per WHERE, then 5. PROJECTing per SELECT, then 6. RENAMEing per SELECT, dropping T.s, then 7. implicitly RENAMEing to drop remaining T.s. Between the T.-RENAMEings, algebra operators can also be thought of as logic operators and table names as their predicates: T JOIN ... vs Employee T.EMPLOYEE has name T.NAME ... AND .... But conceptually inside a SELECT statement is a double-RENAME-inducing CROSS JOIN table with T.Cs for column names while outside tables have Cs for column names.

Alternatively an SQL SELECT statement can be thought of logically as 1. introducing FORSOME T IN E around the entire statement per correlation name T and base name or subquery E, then 2. referring to the value of quantified T by using T.C to refer to its C part, then 3. building result rows from T.Cs per FROM etc, then 4. naming the result row columns per the SELECT clause, then 5. leaving the scope of the FORSOMEs. Again the algebra operators are being thought of as logic operators and table names as their predicates. Again though, this conceptually has T.C inside SELECTs but C outside with correlation names coming and going.

These two SQL interpretations are nowhere near as straightforward as just using JOIN or AND, etc, interchangeably. (You don't have to agree that it's simpler, but that perception is why NATURAL JOIN and UNION/EXCEPT CORRESPONDING are there.) (Arguments criticizing this style outside the context of its intended use are specious.)

USING is a kind of middle ground orphan with one foot in the NATURAL JOIN camp and one in the CROSS JOIN. It has no real role in the former because there are no duplicate column names there. In the latter it is more or less just abbreviating JOIN conditions and SELECT clauses.

I can see the disadvantage in the latter forms is that you are expected to have named your primary and foreign keys the same, which is not always practical.

PKs (primary keys), FKs (foreign keys) & other constraints are not needed for querying. (Knowing a column is a function of others allows scalar subqueries, but you can always phrase without.) Moreover any two tables can be meaningfully joined. If you need two columns to have the same name with NATURAL JOIN you rename via SELECT AS.
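Renaming via SELECT AS so that NATURAL JOIN has a shared column to work with might look like the sketch below, written with Python's sqlite3 module. The orders and customers tables, their columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables whose key columns are named differently.
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (100, 1), (101, 2), (102, 1);
""")

# No shared column name: the NATURAL JOIN pairs every order with every customer.
unrenamed_count = conn.execute(
    "SELECT COUNT(*) FROM orders NATURAL JOIN customers"
).fetchone()[0]

# Rename customers.id to customer_id in a subquery, then join naturally.
renamed_rows = conn.execute("""
    SELECT order_id, name
    FROM orders
    NATURAL JOIN (SELECT id AS customer_id, name FROM customers) AS c
    ORDER BY order_id
""").fetchall()

print(unrenamed_count)  # 6
print(renamed_rows)     # [(100, 'Ann'), (101, 'Bob'), (102, 'Ann')]
```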


