Filter Table Before Applying Left Join

Filter Table Before Applying Left Join

You need to move the WHERE filter to the JOIN condition:

SELECT c.Customer, c.State, e.Entry
FROM Customer c
LEFT JOIN Entry e
ON c.Customer=e.Customer
AND e.Category='D'

See SQL Fiddle with Demo

How to left join on two tables that need filtering?

If you know you want all the records from Table1 first, then you can add your filtering conditions to the JOIN's ON statement for table2, or JOIN on a derived table2.

INNER JOIN filter example:

SELECT
*
FROM Table1 AS t1
INNER JOIN Table2 AS t2
ON t1.SupplierId = t2.SupplierId
AND t2.Column1 = ...condition
AND t2.Column2 = ...condition
WHERE
...Table1 filters

INNER JOIN on a derived table:

SELECT
*
FROM Table1 AS t1
INNER JOIN (
SELECT * FROM Table2 WHERE ...Table2 filters
) AS t2
ON t1.SupplierId = t2.SupplierId
WHERE
...Table1 filters

To return all rows from Table1 regardless if a match exists in Table2, then you can perform a LEFT JOIN instead.

LEFT JOIN filter example:

SELECT
*
FROM Table1 AS t1
LEFT JOIN Table2 AS t2
ON t1.SupplierId = t2.SupplierId
AND t2.Column1 = ...condition
AND t2.Column2 = ...condition
WHERE
...Table1 filters

LEFT JOIN on a derived table:

SELECT
*
FROM Table1 AS t1
LEFT JOIN (
SELECT * FROM Table2 WHERE ...Table2 filters
) AS t2
ON t1.SupplierId = t2.SupplierId
WHERE
...Table1 filters

UPDATE:

I just want a column in table 1 showing essentially a match or a NULL.

/*
Show all rows in [2020_supplier1] where SetID = 'ID'
and a count of matching records from [2020supplier2]
*/

SELECT DISTINCT
t1.[Supplier Name],
n.Table2Count -- will be NULL without a match.
FROM [2020_supplier1] AS t1
OUTER APPLY (

SELECT COUNT ( * ) AS Table2Count FROM [2020supplier2] AS t2
WHERE
t2.[Vendor Name] = t1.[Supplier Name]
AND t2.[Engagement Status] = 'your condition'
-- any additional filters...

) AS n
WHERE
t1.[SetID] = 'ID';

You can simplify this even further and save processing time if you don't care about a [2020supplier2] count with the following:

SELECT DISTINCT
t1.[Supplier Name],
n.VendorMatch
FROM [2020_supplier1] AS t1
OUTER APPLY (

SELECT CASE
WHEN EXISTS (

SELECT * FROM [2020supplier2] AS t2 WHERE
t2.[Vendor Name] = t1.[Supplier Name]
AND t2.[Engagement Status] = 'your condition'
-- any additional filters...

) THEN 1
ELSE 0
END AS VendorMatch

) AS n
WHERE
t1.[SetID] = 'ID';

SQL filter LEFT TABLE before left join

A left join follows a simple rule. It keeps all the rows in the first table. The values of columns depend on the on clause. If there is no match, then the corresponding table's columns are NULL -- whether the first or second table.

So, for this query:

select *
from tableA A left join
tableB B
on A.ID = B.ID and A.ID = 20;

All the rows in A are in the result set, regardless of whether or not there is a match. When the id is not 20, then the rows and columns are still taken from A. However, the condition is false so the columns in B are NULL. This is a simple rule. It does not depend on whether the conditions are on the first table or the second table.

For this query:

select *
from tableA A left join
tableB B
on A.ID = B.ID
where A.ID = 20;

The from clause keeps all the rows in A. But then the where clause has its effect. And it filters the rows so on only id 20s are in the result set.

When using a left join:

  • Filter conditions on the first table go in the where clause.
  • Filter conditions on subsequent tables go in the on clause.

LIMIT / Filtering on LEFT JOIN

Snowflake supports joining laterally. That is, on to a function or correlated sub-query. This allows you to join on to a query that returns just one row (per input row).

SELECT  
purchases.date
,purchases.revenue
,clicks.campaign_id
FROM
purchases
LEFT JOIN LATERAL
(
SELECT
campaign_id
,user_id
,click_time
FROM
campaign_clicks
WHERE
user_id = purchases.user_id
-- only select campaign clicks that occurred before the purchase
AND click_time < purchases.purchase_time
-- only include clicks that occurred within 3 days of the purchase
AND click_time >= DATEADD(days, -3, purchases.purchase_time)
ORDER BY
click_time DESC
LIMIT
1
)
AS clicks

SQL left join with filter in JOIN condition vs filter in WHERE clause

The big difference with the Where condition b.status is null or b.status in (10, 100)
is when b.status is say 1 as well as b.id=a.id

In the first query you will still get the row from table A with corresponding B part as NULL as On condition is not fully satisfied.
In the second query you will get the row in the JOIN for both a and b tables which will be lost in the where clause.

Inner join and left join gives same results with where filter and hence which join to use?

This happens when you apply the filter (where clause condition) on the table you're left joining on. In this case the 'Order' table.

It is because your WHERE clause explicitly filters rows from the Order table where the color is pink. It will then join on only the matching Order rows on the Customer table.

You'll see that when you remove the where clause, the left join will function as you expect. :)

Filtering columns from two diff tables after join

Second case will return you more rows where m.name will be null.

The thing is the first query joins tables and filters records afterwards.
Whereas in the second query you're filtering records from the m table out before joining them. Thus, there will be less records in m table to match with records in e

Here is an example. Employee table looks something like this



Leave a reply



Submit