Performance of Inner Join Compared to Cross Join

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.
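The row arithmetic above can be checked with a minimal sketch using Python's built-in sqlite3 module. The table names `a` and `b` and their columns are made up for illustration only.

```python
import sqlite3

# Hypothetical tables: a has 6 rows, b has 3 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (y INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(6)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(3)])

# A cross join pairs every row of a with every row of b.
cross_count = conn.execute("SELECT COUNT(*) FROM a CROSS JOIN b").fetchone()[0]
print(cross_count)  # 18 (6 * 3)
```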

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

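If a WHERE clause that relates the two tables (for example, an equality between a column of each) is added to a cross join, it behaves as an inner join, because the WHERE condition filters out the non-matching combinations.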

As long as your queries abide by common sense and vendor-specific performance guidelines (i), I like to think of the choice of join type as a simple matter of taste.

(i) Vendor Specific Performance Guidelines

  1. MySQL Performance Tuning and Optimization Resources
  2. PostgreSQL Performance Optimization

CROSS JOIN vs INNER JOIN in SQL

A cross join does not match rows against each other: if you have 100 rows in each table with a 1-to-1 match, you get 10,000 results. An inner join returns only 100 rows in the same situation.

These 2 examples will return the same result:

Cross join

select * from table1 cross join table2 where table1.id = table2.fk_id

Inner join

select * from table1 join table2 on table1.id = table2.fk_id

Use the second form, the explicit inner join.
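As a rough check of the claim above, the sketch below builds `table1` and `table2` with 100 rows each in an in-memory SQLite database (the data is invented for illustration) and compares the two queries. It compares result sets only, not execution plans, which can differ between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE table2 (fk_id INTEGER)")
# 100 rows in each table, matching one to one (illustrative data).
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(100)])

# Without a condition, the cross join yields every combination.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM table1 CROSS JOIN table2"
).fetchone()[0]

cross_rows = conn.execute(
    "SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.fk_id"
).fetchall()
inner_rows = conn.execute(
    "SELECT * FROM table1 JOIN table2 ON table1.id = table2.fk_id"
).fetchall()

print(unfiltered)                                # 10000
print(len(inner_rows))                           # 100
print(sorted(cross_rows) == sorted(inner_rows))  # True
```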

SQL Cross Join better in performance than normal join?

The two queries aren't equivalent, because:

SELECT lastname, date
FROM customer, transact
WHERE quantity > 1000

This doesn't actually limit the result to customers who bought more than 1000. It takes every combination of rows from the two tables and excludes those with quantity less than or equal to 1000, so as long as at least one transaction qualifies, every customer is returned.

This query is equivalent to your JOIN version:

SELECT lastname, date
FROM customer c, transact t
WHERE quantity > 1000
AND c.customerid = t.customerid
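To make the difference concrete, here is a small sketch in Python with sqlite3. The schema and the two customers are assumptions made up for this example; only the table and column names come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the real tables are not shown in the question.
conn.execute("CREATE TABLE customer (customerid INTEGER, lastname TEXT)")
conn.execute("CREATE TABLE transact (customerid INTEGER, quantity INTEGER, date TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Adams"), (2, "Baker")])
conn.executemany(
    "INSERT INTO transact VALUES (?, ?, ?)",
    [(1, 1500, "2020-01-01"), (2, 10, "2020-01-02")],
)

# No join condition: the one large transaction is paired with every customer.
unrelated = conn.execute(
    "SELECT lastname, date FROM customer, transact "
    "WHERE quantity > 1000 ORDER BY lastname"
).fetchall()

# With the join condition, only the customer who made the purchase remains.
related = conn.execute(
    "SELECT lastname, date FROM customer c, transact t "
    "WHERE quantity > 1000 AND c.customerid = t.customerid"
).fetchall()

# The explicit JOIN form returns the same rows as the corrected query.
explicit = conn.execute(
    "SELECT lastname, date FROM customer c "
    "JOIN transact t ON c.customerid = t.customerid WHERE quantity > 1000"
).fetchall()

print(unrelated)  # [('Adams', '2020-01-01'), ('Baker', '2020-01-01')]
print(related)    # [('Adams', '2020-01-01')]
print(explicit == related)  # True
```

Baker never bought more than 1000 of anything, yet appears in the first result because nothing ties the transaction to its customer.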

The explicit JOIN version is preferred because the comma-listed (implicit) syntax is outdated, but both should have the same execution plan and identical performance. The explicit JOIN version is easier to read in my opinion, and the fact that explicit JOIN syntax has been part of the standard since SQL-92 should be enough reason to avoid the comma-listed method.

Benefits of INNER JOIN over CROSS APPLY

You can always use a CROSS APPLY where you'd use an INNER JOIN. But there are reasons you might (and often will) prefer INNER JOIN.

When the two are equivalent, the SQL Server optimizer does not treat them differently in my experience. Therefore, I do not accept the suggestion that CROSS APPLY is faster. When like is compared with like, the performance has been identical in all the query plans I have seen.

INNER JOIN is more convenient to write and it is idiomatic, which makes it more legible and maintainable. INNER JOIN is also more widely supported, although that probably does not matter on SQL Server. I also suspect that many developers simply do not know CROSS APPLY.

What's the difference between a cross join and an inner join with identical filter?

These two queries are functionally identical, as is the following query:

select c1, c2, c3
from t1, t2
where t1.f1 = t2.f2

What follows is my personal opinion:

Always write inner joins with the JOIN ... ON ... or JOIN ... USING (...) syntax.
The advantages are:

  1. It is immediately clear to the reader what you are doing and what the join condition is.

  2. You can never forget to write a join condition, because you are required to write one.

    This protects you from queries that return 1 billion rows instead of 10000 just because you forgot some join conditions, which is a frequent beginner's mistake.

Also note that while for inner joins it doesn't matter if you write a condition in the JOIN or in the WHERE clause, it matters for outer joins.
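The outer join caveat can be illustrated with a short sketch in Python and sqlite3. The tables reuse the names `t1`, `t2`, `f1` and `f2` from the query above; the `flag` column and the data are hypothetical additions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (f1 INTEGER)")
conn.execute("CREATE TABLE t2 (f2 INTEGER, flag INTEGER)")  # flag is hypothetical
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, 1), (2, 0)])

# Condition in ON: every t1 row is kept; unmatched rows get NULL for f2.
in_on = conn.execute(
    "SELECT t1.f1, t2.f2 FROM t1 "
    "LEFT JOIN t2 ON t1.f1 = t2.f2 AND t2.flag = 1 ORDER BY t1.f1"
).fetchall()

# Condition in WHERE: applied after the join, so the NULL-extended rows
# are filtered out and the outer join behaves like an inner join.
in_where = conn.execute(
    "SELECT t1.f1, t2.f2 FROM t1 "
    "LEFT JOIN t2 ON t1.f1 = t2.f2 WHERE t2.flag = 1 ORDER BY t1.f1"
).fetchall()

print(in_on)     # [(1, 1), (2, None), (3, None)]
print(in_where)  # [(1, 1)]
```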

INNER JOIN vs CROSS JOIN vs CROSS APPLY

The first two are equivalent. Whether you use an inner join or cross join is really a matter of preference in this case. I think I would typically use the cross join, because there is no real join condition between the tables.

Note: You should never use cross join when the intention is a "real" inner join that has matching conditions between the tables.

The cross apply is not doing the same thing: it chooses only one row. If you want at most one result row per outer row (outer rows with no match are dropped), use cross apply. If you want exactly one result row per outer row (with NULLs when nothing matches), use outer apply.

SQL Join Types and Performance: Cross vs Inner

Your first example is normally called an explicit join and the second one an implicit join. Performance-wise, they should be equivalent, at least in the popular DBMSes.


