What's the difference between using and on in table joins in MySQL?
I don't use the USING syntax, since
- most of my joins aren't suited to it (not the same fieldname that is being matched, and/or multiple matches in the join) and
- it isn't immediately obvious what it translates to in the case with more than two tables
i.e., assuming three tables with 'id' and 'id_2' columns, does
T1 JOIN T2 USING(id) JOIN T3 USING(id_2)
become
T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T1.id_2=T3.id_2 AND T2.id_2=T3.id_2)
or
T1 JOIN T2 ON(T1.id=T2.id) JOIN T3 ON(T2.id_2=T3.id_2)
or something else again?
Finding this out for a particular database version is a fairly trivial exercise, but I don't have a large amount of confidence that it is consistent across all databases, and I'm not the only person that has to maintain my code (so the other people will also have to be aware of what it is equivalent to).
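Finding this out for one engine can be sketched as below. This is only a probe under stated assumptions: it uses Python's built-in sqlite3 module (SQLite, not MySQL), the `label` column and the sample rows are made up for illustration, and the function name `probe_using_semantics` is hypothetical. Other engines may answer differently or reject the query outright, which is exactly the portability concern.

```python
import sqlite3

# Hypothetical sample data: for id=1 the two left tables disagree on id_2,
# so each possible reading of USING(id_2) selects a different set of T3 rows.
SETUP = """
CREATE TABLE T1 (id INTEGER, id_2 INTEGER);
CREATE TABLE T2 (id INTEGER, id_2 INTEGER);
CREATE TABLE T3 (id_2 INTEGER, label TEXT);
INSERT INTO T1 VALUES (1, 10), (2, 30);
INSERT INTO T2 VALUES (1, 20), (2, 30);
INSERT INTO T3 VALUES (10, 'via T1'), (20, 'via T2'), (30, 'both');
"""

USING_QUERY = "SELECT T3.label FROM T1 JOIN T2 USING (id) JOIN T3 USING (id_2)"

# Explicit ON rewrites that the USING form might be equivalent to.
CANDIDATES = {
    "both T1 and T2 must match T3":
        "SELECT T3.label FROM T1 JOIN T2 ON (T1.id = T2.id) "
        "JOIN T3 ON (T1.id_2 = T3.id_2 AND T2.id_2 = T3.id_2)",
    "only T2 is matched against T3":
        "SELECT T3.label FROM T1 JOIN T2 ON (T1.id = T2.id) "
        "JOIN T3 ON (T2.id_2 = T3.id_2)",
    "only T1 is matched against T3":
        "SELECT T3.label FROM T1 JOIN T2 ON (T1.id = T2.id) "
        "JOIN T3 ON (T1.id_2 = T3.id_2)",
}


def probe_using_semantics():
    """Return the name of the ON rewrite that matches the USING query here."""
    con = sqlite3.connect(":memory:")
    con.executescript(SETUP)

    def labels(sql):
        return sorted(row[0] for row in con.execute(sql))

    try:
        actual = labels(USING_QUERY)
    except sqlite3.Error as exc:
        # Some engines refuse the query because id_2 is ambiguous on the left.
        return "rejected: " + str(exc)
    for reading, sql in CANDIDATES.items():
        if labels(sql) == actual:
            return reading
    return "something else"


if __name__ == "__main__":
    print(probe_using_semantics())
```

The answer printed applies only to the SQLite build that Python links against; the same probe would need to be re-run with each target database's own driver.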
An obvious difference with the WHERE vs ON is if the join is outer:
Assuming a T1 with a single ID field, one row containing the value 1, and a T2 with an ID and VALUE field (one row, ID=1, VALUE=6), then we get:
SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID) WHERE T2.VALUE=42
gives no rows, since the WHERE is required to match, whereas
SELECT T1.ID, T2.ID, T2.VALUE FROM T1 LEFT OUTER JOIN T2 ON(T1.ID=T2.ID AND T2.VALUE=42)
will give one row with the values
1, NULL, NULL
since the ON is only required for matching the join, which is optional due to being outer.
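The two queries above can be checked with a small script. This is a minimal sketch assuming SQLite via Python's sqlite3 module rather than MySQL; the function name `outer_join_filter_demo` is made up for the example.

```python
import sqlite3


def outer_join_filter_demo():
    """Run the WHERE-filtered and ON-filtered outer joins on the sample data."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE T1 (ID INTEGER);
        CREATE TABLE T2 (ID INTEGER, VALUE INTEGER);
        INSERT INTO T1 VALUES (1);
        INSERT INTO T2 VALUES (1, 6);
    """)
    # Filter in WHERE: applied after the join, so the only row is discarded.
    where_rows = con.execute(
        "SELECT T1.ID, T2.ID, T2.VALUE "
        "FROM T1 LEFT OUTER JOIN T2 ON (T1.ID = T2.ID) "
        "WHERE T2.VALUE = 42"
    ).fetchall()
    # Filter in ON: only decides whether T2 matches; the T1 row survives.
    on_rows = con.execute(
        "SELECT T1.ID, T2.ID, T2.VALUE "
        "FROM T1 LEFT OUTER JOIN T2 ON (T1.ID = T2.ID AND T2.VALUE = 42)"
    ).fetchall()
    return where_rows, on_rows


if __name__ == "__main__":
    where_rows, on_rows = outer_join_filter_demo()
    print(where_rows)  # []
    print(on_rows)     # [(1, None, None)]
```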
SQL JOIN - WHERE clause vs. ON clause
They are not the same thing.
Consider these queries:
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345
and
SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345
The first will return an order and its lines, if any, for order number 12345.
The second will return all orders, but only order 12345 will have any lines associated with it.
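A quick way to see this is to run both queries against a tiny data set. The sketch below assumes SQLite through Python's sqlite3 module; the second order (67890), the `Item` column and the function name are invented for illustration, and explicit columns replace `SELECT *` to keep the output small.

```python
import sqlite3


def order_filter_demo():
    """Compare a left-table condition placed in WHERE versus in ON."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Orders (ID INTEGER);
        CREATE TABLE OrderLines (OrderID INTEGER, Item TEXT);
        INSERT INTO Orders VALUES (12345), (67890);
        INSERT INTO OrderLines VALUES (12345, 'widget'), (67890, 'gadget');
    """)
    in_where = con.execute("""
        SELECT Orders.ID, OrderLines.Item
        FROM Orders
        LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
        WHERE Orders.ID = 12345
        ORDER BY Orders.ID
    """).fetchall()
    in_on = con.execute("""
        SELECT Orders.ID, OrderLines.Item
        FROM Orders
        LEFT JOIN OrderLines ON OrderLines.OrderID = Orders.ID
                            AND Orders.ID = 12345
        ORDER BY Orders.ID
    """).fetchall()
    return in_where, in_on


if __name__ == "__main__":
    in_where, in_on = order_filter_demo()
    print(in_where)  # [(12345, 'widget')]
    print(in_on)     # [(12345, 'widget'), (67890, None)]
```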
With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.
USING Keyword vs ON clause - MYSQL
The USING clause lets us name the join column once instead of spelling out the full JOIN condition when we are retrieving data from multiple tables. When we use a USING clause, that particular column name should be present in both tables, and the SELECT query will automatically join those tables on equality of the column named in the USING clause.
For example, if the tables have two column names in common, then mention only the desired common column name in the USING clause.
USING is also used while executing Dynamic SQL (Oracle PL/SQL), where it supplies bind values and has nothing to do with joins, like so:
EXECUTE IMMEDIATE 'DELETE FROM dept WHERE deptno = :num'
USING dept_id;
- The USING clause: This allows you to specify the join key by name.
- The ON clause: This syntax allows you to specify the column names for join keys in both tables.
The USING clause
The USING clause is used if several columns share the same name but you don’t want to join using all of these common columns. The columns listed in the USING clause can’t have any qualifiers in the statement, including the WHERE clause.
The ON clause
The ON clause is used to join tables where the column names don’t match in both tables. The join conditions are kept separate from the filter conditions in the WHERE clause.
What's the difference between where clause and on clause when table left join?
The where clause applies to the whole resultset; the on clause only applies to the join in question.
In the example supplied, all of the additional conditions related to fields on the inner side of the join - so in this example, the two queries are effectively identical.
However, if you had included a condition on a value in the table in the outer side of the join, it would have made a significant difference.
You can get more from this link: http://ask.sqlservercentral.com/questions/80067/sql-data-filter-condition-in-join-vs-where-clause
For example:
select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 and t2.f4=1
select t1.f1,t2.f2 from t1 left join t2 on t1.f1 = t2.f2 where t2.f4=1
These two do different things: the former will left join to t2 records where f4 is 1, while the latter has effectively been turned back into an inner join to t2, because the where discards every row in which t2.f4 is NULL.
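The "turned back into an inner join" claim can be checked by comparing the second query with a real inner join. This is a sketch assuming SQLite via Python's sqlite3 module; the sample rows and the function name `left_join_collapse_demo` are made up.

```python
import sqlite3


def left_join_collapse_demo():
    """Return rows for the ON-filtered, WHERE-filtered and plain inner joins."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t1 (f1 INTEGER);
        CREATE TABLE t2 (f2 INTEGER, f4 INTEGER);
        INSERT INTO t1 VALUES (1), (2);
        INSERT INTO t2 VALUES (1, 1), (2, 0);
    """)

    def run(sql):
        return con.execute(sql + " ORDER BY t1.f1").fetchall()

    on_filtered = run(
        "select t1.f1, t2.f2 from t1 left join t2 on t1.f1 = t2.f2 and t2.f4 = 1")
    where_filtered = run(
        "select t1.f1, t2.f2 from t1 left join t2 on t1.f1 = t2.f2 where t2.f4 = 1")
    inner = run(
        "select t1.f1, t2.f2 from t1 inner join t2 on t1.f1 = t2.f2 where t2.f4 = 1")
    return on_filtered, where_filtered, inner


if __name__ == "__main__":
    on_filtered, where_filtered, inner = left_join_collapse_demo()
    print(on_filtered)              # [(1, 1), (2, None)]
    print(where_filtered == inner)  # True
```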
Difference between USING and ON in Oracle SQL
The difference for me is that you can paint yourself into a corner with the USING clause:
CREATE TABLE roster (mgrid INTEGER, empid INTEGER);
CREATE TABLE emp (empid INTEGER, NAME VARCHAR2(20));
INSERT INTO roster VALUES (1,10);
INSERT INTO roster VALUES (1,11);
INSERT INTO roster VALUES (1,12);
INSERT INTO roster VALUES (2,20);
INSERT INTO roster VALUES (2,21);
INSERT INTO emp VALUES (10, 'John');
INSERT INTO emp VALUES (11, 'Steve');
INSERT INTO emp VALUES (12, 'Mary');
INSERT INTO emp VALUES (20, 'Ann');
INSERT INTO emp VALUES (21, 'George');
INSERT INTO emp VALUES (1, 'Pete');
INSERT INTO emp VALUES (2, 'Sally');
SELECT r.mgrid, e2.name, e1.empid, e1.name
FROM roster r JOIN emp e1 USING(empid)
JOIN emp e2 ON r.mgrid = e2.empid;
In the above select, you get an ORA-25154 error, "column part of USING clause cannot have a qualifier".
If you remove the e1.empid qualifier, as in:
SELECT r.mgrid, e2.name, empid, e1.name
FROM roster r JOIN emp e1 USING(empid)
JOIN emp e2 ON r.mgrid = e2.empid;
You get an ORA-00918 error, "column ambiguously defined".
You have to use:
SELECT r.mgrid, e2.name, e1.empid, e1.name
FROM roster r JOIN emp e1 ON r.empid = e1.empid
JOIN emp e2 ON r.mgrid = e2.empid;
The example is contrived, but when I was first exploring the join syntax I ran into this exact problem in a real situation. I have avoided the USING clause ever since. There is no advantage with the USING clause other than a few keystrokes.
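For reference, the ON-only form of the query is portable enough to run unchanged on other engines. The sketch below assumes SQLite through Python's sqlite3 module, so it does not reproduce the ORA errors (those are Oracle-specific); `TEXT` stands in for `VARCHAR2(20)`, an `ORDER BY` is added for a stable result, and the function name is made up.

```python
import sqlite3


def manager_roster():
    """Run the ON-only roster query against the sample data from above."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE roster (mgrid INTEGER, empid INTEGER);
        CREATE TABLE emp (empid INTEGER, name TEXT);
        INSERT INTO roster VALUES (1,10), (1,11), (1,12), (2,20), (2,21);
        INSERT INTO emp VALUES (10,'John'), (11,'Steve'), (12,'Mary'),
                               (20,'Ann'), (21,'George'),
                               (1,'Pete'), (2,'Sally');
    """)
    # e1 resolves the employee, e2 resolves the manager; both joins use ON,
    # so every column can be qualified without restriction.
    return con.execute("""
        SELECT r.mgrid, e2.name, e1.empid, e1.name
        FROM roster r JOIN emp e1 ON r.empid = e1.empid
                      JOIN emp e2 ON r.mgrid = e2.empid
        ORDER BY e1.empid
    """).fetchall()


if __name__ == "__main__":
    for row in manager_roster():
        print(row)
```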
MySQL JOIN ON vs USING?
It is mostly syntactic sugar, but a couple differences are noteworthy:
ON is the more general of the two. One can join tables ON a column, a set of columns and even a condition. For example:
SELECT * FROM world.City JOIN world.Country ON (City.CountryCode = Country.Code) WHERE ...
USING is useful when both tables share a column of the exact same name on which they join. In this case, one may say:
SELECT ... FROM film JOIN film_actor USING (film_id) WHERE ...
An additional nice treat is that one does not need to fully qualify the joining columns:
SELECT film.title, film_id -- film_id is not prefixed
FROM film
JOIN film_actor USING (film_id)
WHERE ...
To illustrate, to do the above with ON, we would have to write:
SELECT film.title, film.film_id -- film.film_id is required here
FROM film
JOIN film_actor ON (film.film_id = film_actor.film_id)
WHERE ...
Notice the film.film_id qualification in the SELECT clause. It would be invalid to just say film_id since that would make for an ambiguity:
ERROR 1052 (23000): Column 'film_id' in field list is ambiguous
As for select *, the joining column appears in the result set twice with ON while it appears only once with USING:
mysql> create table t(i int);insert t select 1;create table t2 select*from t;
Query OK, 0 rows affected (0.11 sec)
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
Query OK, 1 row affected (0.19 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> select*from t join t2 on t.i=t2.i;
+------+------+
| i    | i    |
+------+------+
|    1 |    1 |
+------+------+
1 row in set (0.00 sec)
mysql> select*from t join t2 using(i);
+------+
| i    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)
mysql>
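The same column behaviour can be observed programmatically. This is a sketch assuming SQLite via Python's sqlite3 module, which happens to behave like the MySQL session above on these points; the function name `join_column_names` is made up.

```python
import sqlite3


def join_column_names():
    """Compare result columns and name resolution for ON versus USING."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t (i INTEGER);
        INSERT INTO t VALUES (1);
        CREATE TABLE t2 AS SELECT * FROM t;
    """)
    # cursor.description lists one entry per result column.
    on_cols = [d[0] for d in
               con.execute("SELECT * FROM t JOIN t2 ON t.i = t2.i").description]
    using_cols = [d[0] for d in
                  con.execute("SELECT * FROM t JOIN t2 USING (i)").description]
    # With USING, the unqualified join column resolves without complaint.
    unqualified = con.execute("SELECT i FROM t JOIN t2 USING (i)").fetchall()
    # With ON, the unqualified name is ambiguous and the query is rejected.
    try:
        con.execute("SELECT i FROM t JOIN t2 ON t.i = t2.i")
        ambiguous = False
    except sqlite3.OperationalError:
        ambiguous = True
    return on_cols, using_cols, unqualified, ambiguous


if __name__ == "__main__":
    print(join_column_names())
```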
In SQL / MySQL, what is the difference between ON and WHERE in a join statement?
WHERE is a part of the SELECT query as a whole, ON is a part of each individual join.
ON can only refer to the fields of previously used tables.
When there is no actual match for a record in the left table, LEFT JOIN returns that record once, with all fields of the right table set to NULL. The WHERE clause is then evaluated and filters the result.
In your query, only the records from gifts without a match in sentgifts are returned.
Here's the example
gifts
1 Teddy bear
2 Flowers
sentgifts
1 Alice
1 Bob
---
SELECT *
FROM gifts g
LEFT JOIN
sentgifts sg
ON g.giftID = sg.giftID
---
1 Teddy bear 1 Alice
1 Teddy bear 1 Bob
2 Flowers NULL NULL -- no match in sentgifts
---
SELECT *
FROM gifts g
LEFT JOIN
sentgifts sg
ON g.giftID = sg.giftID
WHERE sg.giftID IS NULL
---
2 Flowers NULL NULL -- no match in sentgifts
As you can see, the absence of a match leaves a NULL in sg.giftID, so only the gifts that have never been sent are returned.
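The gifts example can be reproduced as follows. This is a sketch assuming SQLite via Python's sqlite3 module; the column names `name` and `sentTo` are guesses, since the original only shows `giftID`, and the function name is made up.

```python
import sqlite3


def unsent_gifts():
    """Return the full left join and the anti-join (gifts never sent)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE gifts (giftID INTEGER, name TEXT);
        CREATE TABLE sentgifts (giftID INTEGER, sentTo TEXT);
        INSERT INTO gifts VALUES (1, 'Teddy bear'), (2, 'Flowers');
        INSERT INTO sentgifts VALUES (1, 'Alice'), (1, 'Bob');
    """)
    all_rows = con.execute("""
        SELECT g.giftID, g.name, sg.giftID, sg.sentTo
        FROM gifts g
        LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
        ORDER BY g.giftID, sg.sentTo
    """).fetchall()
    # Keeping only the NULL-extended rows leaves the gifts with no match.
    never_sent = con.execute("""
        SELECT g.giftID, g.name, sg.giftID, sg.sentTo
        FROM gifts g
        LEFT JOIN sentgifts sg ON g.giftID = sg.giftID
        WHERE sg.giftID IS NULL
    """).fetchall()
    return all_rows, never_sent


if __name__ == "__main__":
    all_rows, never_sent = unsent_gifts()
    print(all_rows)
    print(never_sent)  # [(2, 'Flowers', None, None)]
```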
is there any difference between where clause and using association object for performance
They do the same thing. If you run the query in the console, you will see something like this
Order Load (0.3ms) SELECT "orders".* from "orders" WHERE "orders"."id" = $1 LIMIT $2, [["ID", 1], ["LIMIT", 1]]
Product Load (0.3ms) SELECT "products".* from "products" WHERE "products"."order_id" = $1 [["ORDER_ID", 1]]
Inner Join vs Natural Join vs USING clause: are there any advantages?
Now, apart from the fact that the first form has a duplicated column, is there a real advantage to the other two forms? Or are they just syntactic sugar?
TL;DR NATURAL JOIN is used in a certain style of relational programming that is simpler than the usual SQL style. (Although when embedded in SQL it is burdened with the rest of SQL query syntax.) That's because 1. it directly uses the simple operators of predicate logic, the language of precision in engineering (including software engineering), science (including computer science) and mathematics, and moreover 2. simultaneously and alternatively it directly uses the simple operators of relational algebra.
The common complaint about NATURAL JOIN is that since shared columns aren't explicit, after a schema change inappropriate column pairing may occur. And that may be the case in a particular development environment. But in that case there was a requirement that only certain columns be joined and NATURAL JOIN without PROJECT was not appropriate. So these arguments assume that NATURAL JOIN is being used inappropriately. Moreover the arguers aren't even aware that they are ignoring requirements. Such complaints are specious. (Moreover, sound software engineering design principles lead to not having interfaces with such specifications.)
Another related, misconceived and specious complaint from the same camp is that "NATURAL JOIN does not even take foreign key relationships into account". But any join is there because of the table meanings, not the constraints. Constraints are not needed to query. If a constraint is added then a query remains correct. If a constraint is dropped then a query relying on it becomes wrong and must be changed to a phrasing that doesn't rely on it, a phrasing that would never have had to change. This has nothing to do with NATURAL JOIN.
You have described the difference in effect: just one copy of each common column is returned.
From Is there any rule of thumb to construct SQL query from a human-readable description?:
It turns out that natural language expressions and logical expressions and relational algebra expressions and SQL expressions (a hybrid of the last two) correspond in a rather direct way.
Eg from Codd 1970:
The relation depicted is called component. [...] The meaning of component(x, y,z) is that part x is an immediate component (or subassembly) of part y, and z units of part x are needed to assemble one unit of part y.
From this answer:
Every base table has a statement template, aka predicate, parameterized by column names, by which we put a row in or leave it out.
Plugging a row into a predicate gives a statement aka proposition. The rows that make a true proposition go in a table and the rows that make a false proposition stay out. (So a table states the proposition of each present row and states NOT the proposition of each absent row.)
But every table expression value has a predicate per its expression. The relational model is designed so that if tables T and U hold rows where T(...) and U(...) (respectively) then:
- T NATURAL JOIN U holds rows where T(...) AND U(...)
- T WHERE condition holds rows where T(...) AND condition
- T UNION CORRESPONDING U holds rows where T(...) OR U(...)
- T EXCEPT CORRESPONDING U holds rows where T(...) AND NOT U(...)
- SELECT DISTINCT columns to keep FROM T holds rows where THERE EXISTS columns to drop SUCH THAT T(...)
- etc
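The correspondences in the list above can be illustrated by comparing SQL results with the matching logical operators on plain Python sets. This is a sketch under assumptions: it uses SQLite via Python's sqlite3 module, which has no CORRESPONDING keyword, so plain UNION and EXCEPT over identical column lists stand in for it; the single-column tables and the function name are made up.

```python
import sqlite3

T_ROWS = {1, 2, 3}   # values a for which T(a) holds
U_ROWS = {2, 3, 4}   # values a for which U(a) holds


def predicate_correspondence():
    """Evaluate each table expression and return its rows as a set."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE T (a INTEGER)")
    con.execute("CREATE TABLE U (a INTEGER)")
    con.executemany("INSERT INTO T VALUES (?)", [(a,) for a in sorted(T_ROWS)])
    con.executemany("INSERT INTO U VALUES (?)", [(a,) for a in sorted(U_ROWS)])

    def rows(sql):
        return {row[0] for row in con.execute(sql)}

    return {
        "natural_join": rows("SELECT a FROM T NATURAL JOIN U"),      # T AND U
        "where": rows("SELECT a FROM T WHERE a < 3"),                # T AND cond
        "union": rows("SELECT a FROM T UNION SELECT a FROM U"),      # T OR U
        "except": rows("SELECT a FROM T EXCEPT SELECT a FROM U"),    # T AND NOT U
    }


if __name__ == "__main__":
    for name, result in predicate_correspondence().items():
        print(name, sorted(result))
```

Because both tables have the single column `a`, the natural join here degenerates to an intersection, which makes the AND reading easy to see.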
Whereas reasoning about SQL otherwise is... not "natural":
An SQL SELECT statement can be thought of algebraically as 1. implicitly RENAMEing each column C of a table with (possibly implicit) correlation name T to T.C, then 2. CROSS JOINing, then 3. RESTRICTing per INNER ON, then 4. RESTRICTing per WHERE, then 5. PROJECTing per SELECT, then 6. RENAMEing per SELECT, dropping T.s, then 7. implicitly RENAMEing to drop remaining T.s. Between the T.-RENAMEings algebra operators can also be thought of as logic operators and table names as their predicates: T JOIN ... vs Employee T.EMPLOYEE has name T.NAME ... AND .... But conceptually inside a SELECT statement is a double-RENAME-inducing CROSS JOIN table with T.Cs for column names while outside tables have Cs for column names.
Alternatively an SQL SELECT statement can be thought of logically as 1. introducing FORSOME T IN E around the entire statement per correlation name T and base name or subquery E, then 2. referring to the value of quantified T by using T.C to refer to its C part, then 3. building result rows from T.Cs per FROM etc, then 4. naming the result row columns per the SELECT clause, then 5. leaving the scope of the FORSOMEs. Again the algebra operators are being thought of as logic operators and table names as their predicates. Again though, this conceptually has T.C inside SELECTs but C outside with correlation names coming and going.
These two SQL interpretations are nowhere near as straightforward as just using JOIN or AND, etc, interchangeably. (You don't have to agree that it's simpler, but that perception is why NATURAL JOIN and UNION/EXCEPT CORRESPONDING are there.) (Arguments criticizing this style outside the context of its intended use are specious.)
USING is a kind of middle ground orphan with one foot in the NATURAL JOIN camp and one in the CROSS JOIN. It has no real role in the former because there are no duplicate column names there. In the latter it is more or less just abbreviating JOIN conditions and SELECT clauses.
I can see the disadvantage in the latter forms is that you are expected to have named your primary and foreign keys the same, which is not always practical.
PKs (primary keys), FKs (foreign keys) & other constraints are not needed for querying. (Knowing a column is a function of others allows scalar subqueries, but you can always phrase without.) Moreover any two tables can be meaningfully joined. If you need two columns to have the same name with NATURAL JOIN you rename via SELECT AS.
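Renaming via SELECT AS before a NATURAL JOIN might look like the sketch below. It assumes SQLite through Python's sqlite3 module; the `customer` and `orders` tables, their columns and the function name are all hypothetical, and no key constraints are declared, in line with the point that none are needed to query.

```python
import sqlite3


def natural_join_with_rename():
    """Join customer.id to orders.customer_id by renaming, then NATURAL JOIN."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE customer (id INTEGER, name TEXT);
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO orders VALUES (100, 1), (101, 1), (102, 2);
    """)
    # The subquery renames id to customer_id, so it becomes the only column
    # name shared with orders and therefore the only NATURAL JOIN column.
    return con.execute("""
        SELECT name, order_id
        FROM (SELECT id AS customer_id, name FROM customer)
             NATURAL JOIN orders
        ORDER BY order_id
    """).fetchall()


if __name__ == "__main__":
    print(natural_join_with_rename())
```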