How to Find Records That Are Not Joined

How do I find records that are not joined?

select * from a where id not in (select a_id from b)

Or like some other people on this thread says:

select a.* from a
left outer join b on a.id = b.a_id
where b.a_id is null

Finding all records that do NOT join on inner join of two tables?

You can use a FULL JOIN to combine the two tables, then use a WHERE clause to filter the results down to only non-matching rows by checking for a NULL in each tables primary key value.

Full outer join All rows in all joined tables are included, whether they are matched or not.

SELECT a.pk, b.pk
FROM tableA a
FULL JOIN tableB b ON a.pk=b.fk
WHERE
a.pk IS NULL
OR b.pk IS NULL

How to select all records from one table that do not exist in another table?

SELECT t1.name
FROM table1 t1
LEFT JOIN table2 t2 ON t2.name = t1.name
WHERE t2.name IS NULL

Q: What is happening here?

A: Conceptually, we select all rows from table1 and for each row we attempt to find a row in table2 with the same value for the name column. If there is no such row, we just leave the table2 portion of our result empty for that row. Then we constrain our selection by picking only those rows in the result where the matching row does not exist. Finally, We ignore all fields from our result except for the name column (the one we are sure that exists, from table1).

While it may not be the most performant method possible in all cases, it should work in basically every database engine ever that attempts to implement ANSI 92 SQL

More efficient sql to find records not in join table?

Queries can get slow when having many items in the IN clause. Use a left join instead

select u.*
from sms_users u
left join join_smsuser_campaigns c on u.id = c.sms_user_id
where c.sms_user_id is null

See this great join explanation

How to select rows with no matching entry in another table?

Here's a simple query:

SELECT t1.ID
FROM Table1 t1
LEFT JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL

The key points are:

  1. LEFT JOIN is used; this will return ALL rows from Table1, regardless of whether or not there is a matching row in Table2.

  2. The WHERE t2.ID IS NULL clause; this will restrict the results returned to only those rows where the ID returned from Table2 is null - in other words there is NO record in Table2 for that particular ID from Table1. Table2.ID will be returned as NULL for all records from Table1 where the ID is not matched in Table2.

Find records from one table which don't exist in another

There's several different ways of doing this, with varying efficiency, depending on how good your query optimiser is, and the relative size of your two tables:

This is the shortest statement, and may be quickest if your phone book is very short:

SELECT  *
FROM Call
WHERE phone_number NOT IN (SELECT phone_number FROM Phone_book)

alternatively (thanks to Alterlife)

SELECT *
FROM Call
WHERE NOT EXISTS
(SELECT *
FROM Phone_book
WHERE Phone_book.phone_number = Call.phone_number)

or (thanks to WOPR)

SELECT * 
FROM Call
LEFT OUTER JOIN Phone_Book
ON (Call.phone_number = Phone_book.phone_number)
WHERE Phone_book.phone_number IS NULL

(ignoring that, as others have said, it's normally best to select just the columns you want, not '*')

How to return rows from left table not found in right table?

If you are asking for T-SQL then lets look at fundamentals first. There are three types of joins here each with its own set of logical processing phases as:

  1. A cross join is simplest of all. It implements only one logical query processing phase, a Cartesian Product. This phase operates on the two tables provided as inputs to the join and produces a Cartesian product of the two. That is, each row from one input is matched with all rows from the other. So if you have m rows in one table and n rows in the other, you get m×n rows in the result.
  2. Then are Inner joins : They apply two logical query processing phases: A Cartesian product between the two input tables as in a cross join, and then it filters rows based on a predicate that you specify in ON clause (also known as Join condition).
  3. Next comes the third type of joins, Outer Joins:

    In an outer join, you mark a table as a preserved table by using the keywords LEFT OUTER JOIN, RIGHT OUTER JOIN, or FULL OUTER JOIN between the table names. The OUTER keyword is optional. The LEFT keyword means that the rows of the left table are preserved; the RIGHT keyword means that the rows in the right table are preserved; and the FULL keyword means that the rows in both the left and right tables are preserved.

    The third logical query processing phase of an outer join identifies the rows from the preserved table that did not find matches in the other table based on the ON predicate. This phase adds those rows to the result table produced by the first two phases of the join, and uses NULL marks as placeholders for the attributes from the nonpreserved side of the join in those outer rows.

Now if we look at the question: To return records from the left table which are not found in the right table use Left outer join and filter out the rows with NULL values for the attributes from the right side of the join.

How to select records not involved in a join

You're using conflicting terms in the where clause:

WHERE car.car_id = space.car_id AND space.car_id IS NULL

But space.car_id cannot both be null and match car.car_id.

You're looking for a left join:

SELECT  *
FROM car
LEFT JOIN
space
ON car.car_id = space.car_id
WHERE space.car_id IS NULL

Find records where join doesn't exist

Use an EXISTS expression:

WHERE NOT EXISTS (
SELECT FROM votes v -- SELECT list can be empty
WHERE v.some_id = base_table.some_id
AND v.user_id = ?
)

The difference

... between NOT EXISTS() (Ⓔ) and NOT IN() (Ⓘ) is twofold:

  1. Performance

    Ⓔ is generally faster. It stops processing the subquery as soon as the first match is found. The manual:

    The subquery will generally only be executed long enough to determine
    whether at least one row is returned, not all the way to completion.

    Ⓘ can also be optimized by the query planner, but to a lesser extent since NULL handling makes it more complex.

  2. Correctness

    If one of the resulting values in the subquery expression is NULL, the result of Ⓘ is NULL, while common logic would expect TRUE - and Ⓔ will return TRUE. The manual:

    If all the per-row results are either unequal or null, with at least
    one null, then the result of NOT IN is null.

Essentially, (NOT) EXISTS is the better choice in most cases.

Example

Your query can look like this:

SELECT *
FROM questions q
WHERE NOT EXISTS (
SELECT FROM votes v
WHERE v.question_id = q.id
AND v.user_id = ?
);

Do not join to votes in the base query. That would void the effort.

Besides NOT EXISTS and NOT IN there are additional syntax options with LEFT JOIN / IS NULL and EXCEPT. See:

  • Select rows which are not present in other table

Select rows which are not present in other table

There are basically 4 techniques for this task, all of them standard SQL.

NOT EXISTS

Often fastest in Postgres.

SELECT ip 
FROM login_log l
WHERE NOT EXISTS (
SELECT -- SELECT list mostly irrelevant; can just be empty in Postgres
FROM ip_location
WHERE ip = l.ip
);

Also consider:

  • What is easier to read in EXISTS subqueries?

LEFT JOIN / IS NULL

Sometimes this is fastest. Often shortest. Often results in the same query plan as NOT EXISTS.

SELECT l.ip 
FROM login_log l
LEFT JOIN ip_location i USING (ip) -- short for: ON i.ip = l.ip
WHERE i.ip IS NULL;

EXCEPT

Short. Not as easily integrated in more complex queries.

SELECT ip 
FROM login_log

EXCEPT ALL -- "ALL" keeps duplicates and makes it faster
SELECT ip
FROM ip_location;

Note that (per documentation):

duplicates are eliminated unless EXCEPT ALL is used.

Typically, you'll want the ALL keyword. If you don't care, still use it because it makes the query faster.

NOT IN

Only good without NULL values or if you know to handle NULL properly. I would not use it for this purpose. Also, performance can deteriorate with bigger tables.

SELECT ip 
FROM login_log
WHERE ip NOT IN (
SELECT DISTINCT ip -- DISTINCT is optional
FROM ip_location
);

NOT IN carries a "trap" for NULL values on either side:

  • Find records where join doesn't exist

Similar question on dba.SE targeted at MySQL:

  • Select rows where value of second column is not present in first column


Related Topics



Leave a reply



Submit