Performance of inner join compared to cross join
Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.
An inner join, by contrast, combines a row from one table with a row from another (or the same) table into a single result row only when the pair satisfies the join condition.
If a WHERE clause that relates the two tables is added to a cross join, it behaves as an inner join, because the WHERE condition filters the combinations down to the matching pairs.
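The row arithmetic above can be checked with a small sketch. It uses Python's built-in sqlite3 module; the tables a and b, their columns, and the sample rows are hypothetical and only chosen to give 6 and 3 rows.

```python
import sqlite3

# In-memory database; tables a (6 rows) and b (3 rows) are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(i, f"a{i}") for i in range(1, 7)])
conn.executemany("INSERT INTO b VALUES (?, ?)", [(1, 1), (2, 1), (3, 4)])

# Every combination of rows: 6 * 3 = 18.
cross_count = conn.execute("SELECT COUNT(*) FROM a CROSS JOIN b").fetchone()[0]

# Adding a WHERE that relates the tables keeps only the matching pairs.
filtered_count = conn.execute(
    "SELECT COUNT(*) FROM a CROSS JOIN b WHERE a.id = b.a_id"
).fetchone()[0]

print(cross_count)     # 18
print(filtered_count)  # 3
```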
As long as your queries abide by common sense and vendor-specific performance guidelines (i), I like to think of the choice of join type as a simple matter of taste.
(i) Vendor Specific Performance Guidelines
- MySQL Performance Tuning and Optimization Resources
- PostgreSQL Performance Optimization
CROSS JOIN vs INNER JOIN in SQL
A cross join does not match rows: if each table has 100 rows with a 1-to-1 match, it returns 10,000 rows. An inner join returns only 100 rows in the same situation.
These 2 examples will return the same result:
Cross join
select * from table1 cross join table2 where table1.id = table2.fk_id
Inner join
select * from table1 join table2 on table1.id = table2.fk_id
Use the second form (the explicit inner join).
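A quick way to convince yourself of both the row counts and the equivalence is to run the two queries against sample data. This is a sketch using Python's sqlite3 module; the column layout of table1 and table2 beyond id and fk_id is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (fk_id INTEGER);
""")
# 100 rows in each table with a 1-to-1 match between id and fk_id.
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(1, 101)])

# Unfiltered cross join: every combination of rows.
unfiltered = conn.execute("SELECT * FROM table1 CROSS JOIN table2").fetchall()

# The two queries from the answer, with an ORDER BY added so they can be compared.
cross_where = conn.execute(
    "SELECT * FROM table1 CROSS JOIN table2 "
    "WHERE table1.id = table2.fk_id ORDER BY table1.id"
).fetchall()
inner = conn.execute(
    "SELECT * FROM table1 JOIN table2 "
    "ON table1.id = table2.fk_id ORDER BY table1.id"
).fetchall()

print(len(unfiltered))       # 10000
print(len(inner))            # 100
print(cross_where == inner)  # True
```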
SQL Cross Join better in performance than normal join?
The two queries aren't equivalent, because:
SELECT lastname, date
FROM customer, transact
WHERE quantity > 1000
This query doesn't actually limit the result to customers that bought more than 1000. It takes every combination of rows from the two tables and excludes those with a quantity less than or equal to 1000, so as long as at least one transaction qualifies, all customers will be returned.
This query is equivalent to your JOIN version:
SELECT lastname, date
FROM customer c, transact t
WHERE quantity > 1000
AND c.customerid = t.customerid
The explicit JOIN version is preferred, as the implicit comma syntax is the older style, but both should have the same execution plan and identical performance. The explicit JOIN version is easier to read in my opinion, but the fact that the comma-separated (implicit) method has been outdated for over a decade (two?) should be enough reason to avoid it.
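The difference between the query without a join condition and the two equivalent versions shows up on even a tiny data set. The following sketch uses Python's sqlite3 module; the schema, customer names, dates, and quantities are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data: only Smith has a transaction above 1000.
conn.executescript("""
    CREATE TABLE customer (customerid INTEGER PRIMARY KEY, lastname TEXT);
    CREATE TABLE transact (customerid INTEGER, date TEXT, quantity INTEGER);
    INSERT INTO customer VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO transact VALUES (1, '2024-01-05', 1500), (2, '2024-01-06', 10);
""")

# No join condition: every customer is paired with the qualifying transaction.
no_condition = conn.execute(
    "SELECT lastname, date FROM customer, transact "
    "WHERE quantity > 1000 ORDER BY lastname"
).fetchall()

# Implicit join with the condition in WHERE.
implicit = conn.execute(
    "SELECT lastname, date FROM customer c, transact t "
    "WHERE quantity > 1000 AND c.customerid = t.customerid ORDER BY lastname"
).fetchall()

# Explicit JOIN version.
explicit = conn.execute(
    "SELECT lastname, date FROM customer c "
    "JOIN transact t ON c.customerid = t.customerid "
    "WHERE quantity > 1000 ORDER BY lastname"
).fetchall()

print(no_condition)  # [('Jones', '2024-01-05'), ('Smith', '2024-01-05')]
print(implicit)      # [('Smith', '2024-01-05')]
print(explicit)      # [('Smith', '2024-01-05')]
```

Jones appears in the first result even though Jones never bought more than 1000, which is exactly the problem described above.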
Benefits of INNER JOIN over CROSS APPLY
You can always use a CROSS APPLY where you'd use an INNER JOIN. But there are reasons you might (and often will) prefer INNER JOIN.
When the two are equivalent, the SQL Server optimizer does not treat them differently in my experience. Therefore, I do not follow the suggestion that a CROSS APPLY is faster. Comparing apples to apples, the performance is identical in all the query plans I have seen.
INNER JOIN is more convenient to write. It is also idiomatic, which makes it more legible and maintainable. INNER JOIN is also more widely supported, although that probably does not matter on SQL Server. I also suspect that many developers simply do not know CROSS APPLY.
What's the difference between a cross join and an inner join with identical filter?
These two queries are functionally identical, as is the following query:
select c1, c2, c3
from t1, t2
where t1.f1 = t2.f2
What follows is my personal opinion:
Always write inner joins with the JOIN ... ON ... or JOIN ... USING (...) syntax.
The advantages are:
- It is immediately clear to the reader what you are doing and what the join condition is.
- You can never forget to write a join condition, because you are required to write one. This protects you from queries that return 1 billion rows instead of 10000 just because you forgot some join conditions, which is a frequent beginner's mistake.
Also note that while for inner joins it doesn't matter if you write a condition in the JOIN or in the WHERE clause, it matters for outer joins.
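The outer join caveat is easy to miss, so here is a minimal sketch of it using Python's sqlite3 module. The tables t1 and t2 reuse the names from the query above, but the column c2 as a status flag and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: both t1 rows have a partner in t2, but only one is flagged 'keep'.
conn.executescript("""
    CREATE TABLE t1 (f1 INTEGER);
    CREATE TABLE t2 (f2 INTEGER, c2 TEXT);
    INSERT INTO t1 VALUES (1), (2);
    INSERT INTO t2 VALUES (1, 'keep'), (2, 'drop');
""")

# Filter inside ON: unmatched t1 rows survive, padded with NULL.
filter_in_on = conn.execute(
    "SELECT t1.f1, t2.c2 FROM t1 "
    "LEFT JOIN t2 ON t1.f1 = t2.f2 AND t2.c2 = 'keep' ORDER BY t1.f1"
).fetchall()

# Filter in WHERE: applied after the join, so the NULL-padded row is removed.
filter_in_where = conn.execute(
    "SELECT t1.f1, t2.c2 FROM t1 "
    "LEFT JOIN t2 ON t1.f1 = t2.f2 WHERE t2.c2 = 'keep' ORDER BY t1.f1"
).fetchall()

# For an inner join the placement makes no difference to the result.
inner_on = conn.execute(
    "SELECT t1.f1, t2.c2 FROM t1 "
    "JOIN t2 ON t1.f1 = t2.f2 AND t2.c2 = 'keep' ORDER BY t1.f1"
).fetchall()
inner_where = conn.execute(
    "SELECT t1.f1, t2.c2 FROM t1 "
    "JOIN t2 ON t1.f1 = t2.f2 WHERE t2.c2 = 'keep' ORDER BY t1.f1"
).fetchall()

print(filter_in_on)             # [(1, 'keep'), (2, None)]
print(filter_in_where)          # [(1, 'keep')]
print(inner_on == inner_where)  # True
```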
INNER JOIN vs CROSS JOIN vs CROSS APPLY
The first two are equivalent. Whether you use an inner join or cross join is really a matter of preference in this case. I think I would typically use the cross join, because there is no real join condition between the tables.
Note: You should never use cross join when the intention is a "real" inner join that has matching conditions between the tables.
The cross apply is not doing the same thing. It is only choosing one row. If your intention is to get at most one matching row, then use cross apply. If the intention is to get exactly one matching row, then use outer apply.
SQL Join Types and Performance: Cross vs Inner
Your first example is normally called an explicit join and the second one an implicit join. Performance-wise, they should be equivalent, at least in the popular DBMSes.
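One way to check the "should be equivalent" claim on a given engine is to compare query plans. The sketch below does this for SQLite only, using EXPLAIN QUERY PLAN through Python's sqlite3 module; the customer and transact tables are hypothetical, and other DBMSes expose plans through their own tools, so treat this as an illustration and not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; no data is needed just to inspect the plan.
conn.executescript("""
    CREATE TABLE customer (customerid INTEGER PRIMARY KEY, lastname TEXT);
    CREATE TABLE transact (customerid INTEGER, quantity INTEGER);
""")

def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    # Each EXPLAIN QUERY PLAN row ends with a textual 'detail' column (index 3).
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

explicit_plan = plan(
    "SELECT lastname FROM customer c "
    "JOIN transact t ON c.customerid = t.customerid WHERE quantity > 1000"
)
implicit_plan = plan(
    "SELECT lastname FROM customer c, transact t "
    "WHERE c.customerid = t.customerid AND quantity > 1000"
)

for step in explicit_plan:
    print(step)
print(explicit_plan == implicit_plan)
```

The exact wording of the plan steps varies between SQLite versions, so the comparison matters more than the printed text.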