Cross Join vs Inner Join in SQL

CROSS JOIN vs INNER JOIN in SQL

A cross join does not match rows; it pairs every row of one table with every row of the other. If you have 100 rows in each table with a 1-to-1 match, a cross join returns 10,000 rows, while an inner join returns only 100 rows in the same situation.

These 2 examples will return the same result:

Cross join

select * from table1 cross join table2 where table1.id = table2.fk_id

Inner join

select * from table1 join table2 on table1.id = table2.fk_id

Prefer the second form, the inner join with an explicit ON clause.
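The row counts and the equivalence claimed above can be checked with a small sketch. It runs the two queries against an in-memory SQLite database through Python's sqlite3 module; the choice of SQLite and the table contents (100 rows per table, one match each) are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (fk_id INTEGER);
""")
# Assumed sample data: 100 rows per table, each id matched by exactly one fk_id.
conn.executemany("INSERT INTO table1 (id) VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany("INSERT INTO table2 (fk_id) VALUES (?)", [(i,) for i in range(1, 101)])


def fetch_sorted(sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


# Cross join with no filter: every row paired with every row.
unfiltered = fetch_sorted("SELECT * FROM table1 CROSS JOIN table2")
# The two queries from the answer above.
cross_where = fetch_sorted(
    "SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.fk_id"
)
inner_on = fetch_sorted(
    "SELECT * FROM table1 JOIN table2 ON table1.id = table2.fk_id"
)

print(len(unfiltered), len(cross_where), len(inner_on))  # 10000 100 100
print(cross_where == inner_on)  # True
```

The unfiltered cross join yields 100 * 100 rows, while both filtered forms return the same 100 rows.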

In SQL, what's the difference between JOIN and CROSS JOIN?

MySQL makes no distinction between JOIN, CROSS JOIN, and INNER JOIN; there they are syntactic equivalents. In standard SQL they are not equivalent.

In both your examples the clause

WHERE t1.a3 = t2.a1 

converts any sort of join into an inner join. The standard way of expressing this query is

SELECT t1.a1, t1.a2, t1.a3 
FROM t1
JOIN t2 ON t1.a3 = t2.a1

Performance of inner join compared to cross join

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

If a WHERE clause that relates the two tables is added to a cross join, it behaves as an inner join, because the WHERE condition keeps only the matching combinations.

As long as your queries abide by common sense and vendor-specific performance guidelines (i), I like to think of the choice of join type as a simple matter of taste.

(i) Vendor Specific Performance Guidelines

  1. MySQL Performance Tuning and Optimization Resources
  2. PostgreSQL Performance Optimization

How to use inner join and cross join in one query?

A cross join is a special case of an inner join: it is an inner join without any conditions. In other words, it joins every row of one table with every row of the other. The only reason it exists is so that you don't have to write a silly statement like inner join TABLE on 1=1.
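A minimal sketch of that equivalence, assuming SQLite (via Python's sqlite3 module) and two made-up tables, colors and sizes, that do not appear in the original question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE colors (color TEXT);
    CREATE TABLE sizes (size TEXT);
    INSERT INTO colors VALUES ('red'), ('green'), ('blue');
    INSERT INTO sizes VALUES ('S'), ('M');
""")

# Cross join: no condition at all.
cross_rows = sorted(
    conn.execute("SELECT color, size FROM colors CROSS JOIN sizes").fetchall()
)
# Inner join whose condition is always true.
always_true_rows = sorted(
    conn.execute("SELECT color, size FROM colors INNER JOIN sizes ON 1 = 1").fetchall()
)

print(len(cross_rows))                 # 6, i.e. 3 colors * 2 sizes
print(cross_rows == always_true_rows)  # True
```

Both queries return all 3 * 2 combinations, which is why the on 1=1 form is never needed in practice.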

You should only use cross join when you want to join every single row. It makes no sense to use it and then have conditions in the where clause specifying which rows should match (as you are doing).

The proper way to do it is to use inner join, and to specify the join conditions in the on clause rather than the where clause. This gives your query a more logical flow, and in some cases it may also be more efficient (the earlier you exclude rows from your result, the faster your query runs), although many optimizers treat both forms of an inner join the same way.

Here is a revised query that uses inner join:

select T.PRICE, S.ROW, S.NUMBER, M.TITLE
from [cinema_no_keys].[dbo].[TICKET] T
inner join [cinema_no_keys].[dbo].[SEAT] S
    on T.ID_SEAT = S.ID_SEAT
inner join [cinema_no_keys].[dbo].[SHOW] SH
    on SH.DATE_HOUR = T.DATE_HOUR
inner join [cinema_no_keys].[dbo].[MOVIE] M
    on M.ID_MOVIE = SH.ID_MOVIE

I changed the order of joins: because movie links to a field in show, show should come before movie.

INNER JOIN vs CROSS JOIN vs CROSS APPLY

The first two are equivalent. Whether you use an inner join or cross join is really a matter of preference in this case. I think I would typically use the cross join, because there is no real join condition between the tables.

Note: You should never use cross join when the intention is a "real" inner join that has matching conditions between the tables.

The cross apply is not doing the same thing. It is only choosing one row. If your intention is to get at most one matching row, then use cross apply. If the intention is to get exactly one matching row, then use outer apply.

What's the difference between a cross join and an inner join with identical filter?

These two queries are functionally identical, as is the following query:

select c1, c2, c3
from t1, t2
where t1.f1 = t2.f2

What follows is my personal opinion:

Always write inner joins with the JOIN ... ON ... or JOIN ... USING (...) syntax.
The advantages are:

  1. It is immediately clear to the reader what you are doing and what the join condition is.

  2. You can never forget to write a join condition, because you are required to write one.

    This protects you from queries that return 1 billion rows instead of 10000 just because you forgot some join conditions, which is a frequent beginner's mistake.

Also note that while for inner joins it doesn't matter if you write a condition in the JOIN or in the WHERE clause, it matters for outer joins.
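To illustrate the last point, here is a small sketch under stated assumptions: SQLite through Python's sqlite3 module, and two invented tables, customers and orders, with one paid and one open order. It moves the same filter between the ON clause and the WHERE clause for both an inner and a left outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 'paid'), (2, 'open');
""")


def names(sql):
    """Return the customer names produced by a query, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


# Inner join: the filter gives the same rows in ON and in WHERE.
inner_on = names("""
    SELECT c.name FROM customers c
    JOIN orders o ON o.customer_id = c.id AND o.status = 'paid'""")
inner_where = names("""
    SELECT c.name FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'paid'""")

# Left join: in ON the filter only limits which orders match, so Bob is kept
# with NULL order columns; in WHERE it removes Bob's row after the join.
left_on = names("""
    SELECT c.name FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id AND o.status = 'paid'""")
left_where = names("""
    SELECT c.name FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'paid'""")

print(inner_on, inner_where)  # ['Ann'] ['Ann']
print(left_on, left_where)    # ['Ann', 'Bob'] ['Ann']
```

With the filter in WHERE, the left join effectively degrades to an inner join, which is usually not what the author of an outer join intended.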

What is the difference between CROSS JOIN and multiple tables in one FROM?

The first with the comma is an old style from the previous century.

The second with the CROSS JOIN uses the newer ANSI JOIN syntax.

And those 2 queries will indeed give the same results.

They both link every record of table "a" against every record of table "b".

So if table "a" has 10 rows and table "b" has 100 rows, the result would be 10 * 100 = 1000 records.
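A quick way to confirm this, sketched with SQLite through Python's sqlite3 module (an assumption; the column names x and y and the row values are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (y INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(10)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(100)])

# Old comma style versus explicit CROSS JOIN.
comma_rows = sorted(conn.execute("SELECT a.x, b.y FROM a, b").fetchall())
cross_rows = sorted(conn.execute("SELECT a.x, b.y FROM a CROSS JOIN b").fetchall())

print(len(comma_rows), len(cross_rows))  # 1000 1000
print(comma_rows == cross_rows)          # True
```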

But why does that first outdated style still exist in some DBMSs?

Mostly for backward compatibility reasons, so that older SQL statements don't suddenly break.

Most SQL specialists these days would frown upon someone who still uses that outdated comma syntax, although it's often forgiven for an intentional cartesian product.

A CROSS JOIN is a cartesian product JOIN that's lacking the ON clause that defines the relationship between the 2 tables.

In the ANSI JOIN syntax there are also the OUTER joins: LEFT JOIN, RIGHT JOIN, FULL JOIN

And the normal JOIN, aka the INNER JOIN.

But those normally require the ON clause, while a CROSS JOIN doesn't.

An example of a query using different JOIN types:

SELECT *
FROM jars
JOIN apples ON apples.jar_id = jars.id
LEFT JOIN peaches ON peaches.jar_id = jars.id
CROSS JOIN bananas AS bnns
RIGHT JOIN crates ON crates.id = jars.crate_id
FULL JOIN nuts ON nuts.jar_id = jars.id
WHERE jars.name = 'FruityMix'

The nice thing about the JOIN syntax is that the link criteria and the search criteria are separated.

In the old comma style that difference is harder to notice, so it's easier to forget a link criterion.

SELECT *
FROM crates, jars, apples, peaches, bananas, nuts
WHERE apples.jar_id = jars.id
AND jars.name = 'NuttyFruitBomb'
AND peaches.jar_id = jars.id(+)
AND crates.id(+) = jars.crate_id;

Did you notice that the first query has 1 cartesian product join, but the second has 2? That's why the 2nd is rather nutty.
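The effect of a forgotten link criterion can be sketched on a cut-down version of the example. The sketch assumes SQLite via Python's sqlite3 module, keeps only the jars, apples and nuts tables, leaves out the Oracle-style (+) outer joins, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jars (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE apples (jar_id INTEGER);
    CREATE TABLE nuts (jar_id INTEGER);
    INSERT INTO jars VALUES (1, 'FruityMix'), (2, 'Other');
    INSERT INTO apples VALUES (1), (1), (2);
    INSERT INTO nuts VALUES (1), (2), (2), (2);
""")


def count(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())


# Comma style with every link criterion present.
complete = count("""
    SELECT * FROM jars, apples, nuts
    WHERE apples.jar_id = jars.id
      AND nuts.jar_id = jars.id
      AND jars.name = 'FruityMix'""")

# Same query with the nuts criterion forgotten: nuts becomes a cartesian product.
forgotten = count("""
    SELECT * FROM jars, apples, nuts
    WHERE apples.jar_id = jars.id
      AND jars.name = 'FruityMix'""")

# JOIN syntax keeps each link criterion next to the table it belongs to.
explicit = count("""
    SELECT * FROM jars
    JOIN apples ON apples.jar_id = jars.id
    JOIN nuts ON nuts.jar_id = jars.id
    WHERE jars.name = 'FruityMix'""")

print(complete, forgotten, explicit)  # 2 8 2
```

With only four nuts rows the blow-up is from 2 to 8 rows; on real tables the same omission multiplies the result by the full size of the unlinked table.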


