Condition Within Join or Where

Condition within JOIN or WHERE

The relational algebra allows interchangeability of the predicates in the WHERE clause and the INNER JOIN, so even INNER JOIN queries with WHERE clauses can have the predicates rearrranged by the optimizer so that they may already be excluded during the JOIN process.

I recommend you write the queries in the most readable way possible.

Sometimes this includes making the INNER JOIN relatively "incomplete" and putting some of the criteria in the WHERE simply to make the lists of filtering criteria more easily maintainable.

For example, instead of:

SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
AND c.State = 'NY'
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
AND a.Status = 1

Write:

SELECT *
FROM Customers c
INNER JOIN CustomerAccounts ca
ON ca.CustomerID = c.CustomerID
INNER JOIN Accounts a
ON ca.AccountID = a.AccountID
WHERE c.State = 'NY'
AND a.Status = 1

But it depends, of course.

INNER JOIN condition in WHERE clause or ON clause?

For inner joins like this they are logically equivalent. However, you can run in to situations where a condition in the join clause means something different than a condition in the where clause.

As a simple illustration, imagine you do a left join like so;

select x.id
from x
left join y
on x.id = y.id
;

Here we're taking all the rows from x, regardless of whether there is a matching id in y. Now let's say our join condition grows - we're not just looking for matches in y based on the id but also on id_type.

select x.id
from x
left join y
on x.id = y.id
and y.id_type = 'some type'
;

Again this gives all the rows in x regardless of whether there is a matching (id, id_type) in y.

This is very different, though:

select x.id
from x
left join y
on x.id = y.id
where y.id_type = 'some type'
;

In this situation, we're picking all the rows of x and trying to match to rows from y. Now for rows for which there is no match in y, y.id_type will be null. Because of that, y.id_type = 'some type' isn't satisfied, so those rows where there is no match are discarded, which effectively turned this in to an inner join.

Long story short: for inner joins it doesn't matter where the conditions go but for outer joins it can.

SQL JOIN - WHERE clause vs. ON clause

They are not the same thing.

Consider these queries:

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
WHERE Orders.ID = 12345

and

SELECT *
FROM Orders
LEFT JOIN OrderLines ON OrderLines.OrderID=Orders.ID
AND Orders.ID = 12345

The first will return an order and its lines, if any, for order number 12345. The second will return all orders, but only order 12345 will have any lines associated with it.

With an INNER JOIN, the clauses are effectively equivalent. However, just because they are functionally the same, in that they produce the same results, does not mean the two kinds of clauses have the same semantic meaning.

inner join on condition or using where?

SQL is not a procedural language. It is a descriptive language. A query describes the result set that you want to produce.

With an inner join, the two queries in your question are identical -- they produce the same result set under all circumstances. Which to prefer is a stylistic preference. MySQL should treat the two the same way from an optimization perspective.

One preference is that filters on a single table are more appropriate for WHERE and ON.

With an outer join, the two queries are not the same, and you should use the one that expresses your intent.

SQL LEFT JOIN: difference between WHERE and condition inside AND

with a left join there is a difference

with condition on left join rows with column > 10 will be there filled with nulls

with where condition rows will be filtered out

with a inner join there is no difference

example:

declare @t table (id int, dummy varchar(20))
declare @a table (id int, age int, col int)

insert into @t
select * from (
values
(1, 'pippo' ),
(2, 'pluto' ),
(3, 'paperino' ),
(4, 'ciccio' ),
(5, 'caio' ),
(5, 'sempronio')
) x (c1,c2)

insert into @a
select * from (
values
(1, 38, 2 ),
(2, 26, 5 ),
(3, 41, 12),
(4, 15, 11),
(5, 39, 7 )
) x (c1,c2,c3)

select t.*, a.age
from @t t
left join @a a on t.ID = a.ID and a.col > 10

Outputs:

id  dummy       age
1 pippo NULL
2 pluto NULL
3 paperino 41
4 ciccio 15
5 caio NULL
5 sempronio NULL

While

select t.*, a.age
from @t t
left join @a a on t.ID = a.ID
where a.col > 10

Outputs:

id  dummy       age
3 paperino 41
4 ciccio 15

So with LEFT JOIN you will get ALWAYS all the rows from 1st table

If the join condition is true, you will get columns from joined table filled with their values, if the condition is false their columns will be NULL

With WHERE condition you will get only the rows that match the condition.

left join and where condition in joining condition

You should not use column related to left table in where condition (this work as a INNER JOIN) move the condition for left join in the related ON clause

 select *  
FROM table1 t1
left join table2 t2
ON t1.id = t2.fk_id AND t2.id_number = 12174
WHERE t1.code = 'CODE1' ;

The where condition is the equivalent part of the INNER JOIN clause this is the reason that you have this behavior..

adding the condition to the on clause mean that also the added condition work as an outer join ..

SQL left join with filter in JOIN condition vs filter in WHERE clause

The big difference with the Where condition b.status is null or b.status in (10, 100)
is when b.status is say 1 as well as b.id=a.id

In the first query you will still get the row from table A with corresponding B part as NULL as On condition is not fully satisfied.
In the second query you will get the row in the JOIN for both a and b tables which will be lost in the where clause.

Which performs first WHERE clause or JOIN clause

The conceptual order of query processing is:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

But this is just a conceptual order. In fact the engine may decide to rearrange clauses. Here is proof. Let's make 2 tables with 1000000 rows each:

CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))


;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))

INSERT INTO test1(name) SELECT 'a' FROM cte

Now run 2 queries:

SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1


SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1

Notice that the first query will filter most rows out in the join condition, but the second query filters in the where condition. Look at the produced plans:

1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)

2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)

This means that in the first query optimized, the engine decided first to evaluate the join condition to filter out rows. In the second query, it evaluated the where clause first.



Related Topics



Leave a reply



Submit