Is It Better to Do an Equi Join in the FROM Clause or WHERE Clause?

It's a style matter. Generally, you'd want to put the conditions that define the "shape" of the result set in the FROM clause (i.e. those that control which rows from each table should join together to produce a result), whereas the conditions that filter the result set should be in the WHERE clause. For INNER JOINs, the effects are equal, but once OUTER JOINs (LEFT, RIGHT) are involved, the placement can change the result, and the distinction feels a lot clearer.


In your first example, I'm left asking "what has this got to do with Table B?" when I encounter this odd condition in the JOIN. Whereas in the second, I can skip over the FROM clause (and all JOINs) if I'm not interested, and just see the conditions which determine whether rows are going to be returned in the WHERE clause.
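To make the "shape versus filter" distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables a and b, their columns, and the sample rows are hypothetical and invented purely for illustration; the same behaviour should hold in other engines, but only SQLite is exercised here.

```python
import sqlite3

# Hypothetical tables: "a" is the parent, "b" holds optional child rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, status TEXT);
    INSERT INTO a VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO b VALUES (10, 1, 'active'), (11, 2, 'closed');
""")

# Condition in the ON clause: it only decides which b rows may pair up,
# so every row of a survives (unmatched ones get NULLs).
on_rows = conn.execute("""
    SELECT a.name, b.status
    FROM a
    LEFT JOIN b ON b.a_id = a.id AND b.status = 'active'
    ORDER BY a.id
""").fetchall()

# Same condition in the WHERE clause: it filters the joined result,
# discarding the NULL-extended rows as well.
where_rows = conn.execute("""
    SELECT a.name, b.status
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    WHERE b.status = 'active'
    ORDER BY a.id
""").fetchall()

print(on_rows)     # three rows, two of them with status None
print(where_rows)  # one row
```

With an INNER JOIN in place of the LEFT JOIN, both queries would return the same single row, which is why the choice is purely stylistic there.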

What is a valid use case for using a Non-Equi Join?

Back in the day, with SQL Server 2005 and earlier, it could be claimed that this was slightly faster at times, depending on how the server was maintained. I got used to doing it this way because it made logical sense to me to limit scope as early as possible, going for the biggest tables first to get more bang for the buck.

E.g., say I have three tables A, B, and C. A and B have MILLIONS of rows and some indexes on a Dt (date) field, while the other table has only a few tens of thousands of rows. A lot of the time I would do something like this:

Select (columns)
From a
    inner join b on a.Id = b.FId
        and a.Dt >= (somedate)
    inner join c on b.Id = c.FId

To me it generally made more sense to limit scope as soon as possible, and from what I have read and seen, the FROM clause is logically processed first by the SQL Server engine. With a WHERE clause, I would be assembling a set of potentially millions upon millions of rows and only THEN filtering it, whereas an inner join condition states requirements that MUST match for a row to be returned, limiting scope earlier. The WHERE clause does the same filtering but is evaluated AFTER the FROM clause, so it seemed reasonable to conclude it would be slower.

However, there is a constant debate in dev circles about performance versus readability. So if I had something like:

Select (columns)
From a
    inner join b on a.Id = b.FId
        and a.Dt >= (somedate)
        and a.ocol = (criteria)
    left outer join c on b.Id = c.FId
where c.ocol = (criteria)

Someone could tell me: "Hey man, you are only getting a 0.00001 performance boost from that, how about just putting it all in the WHERE clause?" It is sometimes a balancing act between performance and readability. If something is lagging badly, though, I could rightfully argue that a certain way may be better. However, I read that around 2012, or maybe 2008 R2 or so, Microsoft reworked the engine so that it compiles queries more efficiently anyway, so this essentially no longer saves time. You can test it yourself, though, if you want:

Run this in SQL Server Management Studio:

SET STATISTICS TIME ON;

And you will see things like this:

SQL Server parse and compile time: 
CPU time = 0 ms, elapsed time = 2 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 2 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 8 ms.

This output appears on the Messages tab. You can also take the more heavy-handed route of the 'Client Statistics' tab (Query > Include Client Statistics) to see even more detail. Suffice it to say, this is just a syntactic trick employed by many to try to limit scope earlier in the engine's execution. However, since the reworking, it may not make anything better any more. I still use it when coding on my own, though; you get used to things :)

Inner join vs Where

No! Both forms produce the same execution plan. Consider these two tables:

CREATE TABLE table1 (
    id INT,
    name VARCHAR(20)
);

CREATE TABLE table2 (
    id INT,
    name VARCHAR(20)
);

The execution plan for the query using the inner join:

-- with inner join

EXPLAIN PLAN FOR
SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.id;

SELECT *
FROM TABLE (DBMS_XPLAN.DISPLAY);

-- 0 select statement
-- 1 hash join (access("T1"."ID"="T2"."ID"))
-- 2 table access full table1
-- 3 table access full table2

And the execution plan for the query using a WHERE clause.

-- with where clause

EXPLAIN PLAN FOR
SELECT * FROM table1 t1, table2 t2
WHERE t1.id = t2.id;

SELECT *
FROM TABLE (DBMS_XPLAN.DISPLAY);

-- 0 select statement
-- 1 hash join (access("T1"."ID"="T2"."ID"))
-- 2 table access full table1
-- 3 table access full table2
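The plans above come from Oracle. As a rough way to repeat the comparison without an Oracle instance, the sketch below runs both query forms against SQLite through Python's sqlite3 module and compares the returned rows and the EXPLAIN QUERY PLAN text. The sample rows are made up, and SQLite's planner is a different engine, so treat this as an illustration of the method and not as evidence about Oracle or SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INT, name VARCHAR(20));
    CREATE TABLE table2 (id INT, name VARCHAR(20));
    -- Made-up sample rows, only so the queries return something.
    INSERT INTO table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table2 VALUES (2, 'deux'), (3, 'trois'), (4, 'quatre');
""")

JOIN_SQL = "SELECT * FROM table1 t1 INNER JOIN table2 t2 ON t1.id = t2.id"
WHERE_SQL = "SELECT * FROM table1 t1, table2 t2 WHERE t1.id = t2.id"

def plan(sql):
    # Each EXPLAIN QUERY PLAN row has four columns; the last is the step description.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

join_rows = sorted(conn.execute(JOIN_SQL).fetchall())
where_rows = sorted(conn.execute(WHERE_SQL).fetchall())
join_plan = plan(JOIN_SQL)
where_plan = plan(WHERE_SQL)

print("same rows:", join_rows == where_rows)
print("same plan:", join_plan == where_plan)
for step in join_plan:
    print("  ", step)
```

The exact plan text depends on the SQLite version, so compare the two plans with each other and not against a fixed string.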

Difference between Equi-Join and Inner-Join in SQL

Try this:

SELECT a.First_Name, b.Dept_Name, alt.Min_Salary AS Min_Salary
FROM table123 a
INNER JOIN table246 b
    ON a.Dept_ID = b.Dept_ID
INNER JOIN (
    SELECT Dept_ID, MIN(Salary) Min_Salary
    FROM table123
    GROUP BY Dept_ID
) alt
    ON b.Dept_ID = alt.Dept_ID
WHERE a.Salary = alt.Min_Salary;
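To see what the query above returns, here is a small sketch that runs it on SQLite via Python's sqlite3 module. The column types and the sample employees and departments are assumptions invented for the demo; only the table and column names come from the query itself. An ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema, inferred from the column names used in the query.
    CREATE TABLE table246 (Dept_ID INTEGER, Dept_Name TEXT);
    CREATE TABLE table123 (First_Name TEXT, Dept_ID INTEGER, Salary INTEGER);
    INSERT INTO table246 VALUES (1, 'Sales'), (2, 'IT');
    INSERT INTO table123 VALUES
        ('Ann', 1, 3000), ('Bob', 1, 2500),
        ('Cid', 2, 4000), ('Dee', 2, 4000), ('Eve', 2, 5000);
""")

lowest_paid = conn.execute("""
    SELECT a.First_Name, b.Dept_Name, alt.Min_Salary AS Min_Salary
    FROM table123 a
    INNER JOIN table246 b
        ON a.Dept_ID = b.Dept_ID
    INNER JOIN (
        SELECT Dept_ID, MIN(Salary) AS Min_Salary
        FROM table123
        GROUP BY Dept_ID
    ) alt
        ON b.Dept_ID = alt.Dept_ID
    WHERE a.Salary = alt.Min_Salary
    ORDER BY a.First_Name
""").fetchall()

for row in lowest_paid:
    print(row)
```

Note that ties are kept: both 'Cid' and 'Dee' earn the minimum in their department, so both appear.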

Inner Join, Natural Joins and Equi Join

An inner join of A and B combines the columns of a row from A and a row from B based on a join predicate. For example, a "sempai" join: SELECT ... FROM people A INNER JOIN people B ON A.age > B.age will pair each person with each person who is their junior; the most junior people will not be selected from A, and the most senior people will not be selected from B, because there are no matching rows.
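The "sempai" join can be tried out with a few rows. The sketch below uses Python's sqlite3 module; the people table layout and the three sample people are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one senior, one middle, one junior person.
    CREATE TABLE people (name TEXT, age INTEGER);
    INSERT INTO people VALUES ('Ann', 30), ('Bob', 25), ('Cid', 20);
""")

# Non-equi self join: pair every person with everyone younger than them.
pairs = conn.execute("""
    SELECT A.name AS senior, B.name AS junior
    FROM people A
    INNER JOIN people B ON A.age > B.age
    ORDER BY A.age DESC, B.age DESC
""").fetchall()

seniors = {senior for senior, _ in pairs}
juniors = {junior for _, junior in pairs}

print(pairs)
print("never on the senior side:", {"Ann", "Bob", "Cid"} - seniors)
print("never on the junior side:", {"Ann", "Bob", "Cid"} - juniors)
```

The youngest person never shows up on the A side and the oldest never on the B side, matching the description above.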

An equi join is a particular join where the join relation is equality. The "sempai" join from the last paragraph is not an equi join, but a "same age" join would be. Typically, though, it would be used for foreign key relationships (equi joins on primary keys), such as SELECT ... FROM person A INNER JOIN bicycle B ON A.bicycle_id = B.id. (Pay no attention to the fact that this is not a proper model, since people sometimes have multiple bicycles... a bit of a silly example, I'm sure I could have found a better one.)

A natural join is a special kind of equi join that assumes equality of all shared columns (without explicitly stating the predicate). So, for example, SELECT ... FROM people A INNER JOIN bicycles B ON A.bicycle_id = B.bicycle_id is equivalent to SELECT ... FROM people A NATURAL JOIN bicycles B, assuming bicycle_id is the only column present in both tables. Most people I know will not use this, for several reasons: it is more common practice for the primary key not to repeat the table name (i.e. bicycles.id rather than bicycles.bicycle_id); the foreign key might not reflect the table name (e.g. person.overseer_id rather than person.person_id, for obvious reasons); and (forgotten by me but thankfully remembered by Sudipta Mondal) there might be unrelated columns that are named the same but make zero sense to join on, like creation_time. For these reasons, I have never used NATURAL JOIN in my life.
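The creation_time pitfall is easy to reproduce. In this sketch (Python's sqlite3 module, with made-up table layouts and rows), both tables share bicycle_id and an unrelated creation_time column, so the natural join silently adds a second equality condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: creation_time exists in both tables by coincidence.
    CREATE TABLE people (name TEXT, bicycle_id INTEGER, creation_time TEXT);
    CREATE TABLE bicycles (bicycle_id INTEGER, model TEXT, creation_time TEXT);
    INSERT INTO people VALUES
        ('Ann', 1, '2020-01-01'),
        ('Bob', 2, '2021-05-05');
    INSERT INTO bicycles VALUES
        (1, 'Roadster', '2020-01-01'),
        (2, 'Gravel',   '2019-03-03');
""")

# Explicit equi join: only bicycle_id has to match.
explicit = conn.execute("""
    SELECT A.name, B.model
    FROM people A
    INNER JOIN bicycles B ON A.bicycle_id = B.bicycle_id
    ORDER BY A.name
""").fetchall()

# Natural join: matches on every shared column name,
# here bicycle_id AND creation_time.
natural = conn.execute("""
    SELECT A.name, B.model
    FROM people A
    NATURAL JOIN bicycles B
    ORDER BY A.name
""").fetchall()

print(explicit)
print(natural)
```

Bob drops out of the natural join only because his row and his bicycle's row happen to have different creation_time values, which has nothing to do with the relationship being queried.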

Equi/natural joins do not necessarily have to be inner.


