Difference between Using a Cross Join and Putting a Comma Between the Two Tables

What is the difference between using a cross join and putting a comma between the two tables?

They return the same results because they are semantically identical. This:

select * 
from A, B

...is (wince) ANSI-89 syntax. Without a WHERE clause to link the tables together, the result is a cartesian product, which is exactly what the alternative provides as well:

select * 
from A
cross join B

...but the CROSS JOIN is ANSI-92 syntax.

About Performance

There's no performance difference between them.

Why Use ANSI-92?

The reason to use ANSI-92 syntax is OUTER JOIN support (i.e. LEFT, FULL, RIGHT). ANSI-89 syntax doesn't have any, so many databases implemented their own, which doesn't port to other databases. Examples: Oracle's (+) and SQL Server's *= and =*.

What is the difference between CROSS JOIN and multiple tables in one FROM?

The first with the comma is an old style from the previous century.

The second with the CROSS JOIN is in newer ANSI JOIN syntax.

And those 2 queries will indeed give the same results.

They both link every record of table "a" against every record of table "b".

So if table "a" has 10 rows and table "b" has 100 rows, then the result would be 10 * 100 = 1000 records.
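To check that arithmetic yourself, here is a minimal sketch. It assumes Python's built-in sqlite3 module and two throwaway tables named "a" and "b" (the single id column is made up for the demo); any other engine should give the same counts.

```python
import sqlite3

# In-memory database with two hypothetical tables: 10 rows in a, 100 rows in b.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(10)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(100)])

# Old comma style and explicit CROSS JOIN, with no condition linking the tables.
comma_rows = conn.execute("SELECT a.id, b.id FROM a, b").fetchall()
cross_rows = conn.execute("SELECT a.id, b.id FROM a CROSS JOIN b").fetchall()

print(len(comma_rows), len(cross_rows))          # 1000 1000
print(sorted(comma_rows) == sorted(cross_rows))  # True
```

Both spellings pair every row of "a" with every row of "b", so the row sets match once you ignore ordering.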

But why does that first outdated style still exist in some DBMS?

Mostly for backward compatibility reasons, so that older SQL queries don't suddenly break.

Most SQL specialists these days would frown upon someone who still uses that outdated comma syntax (although it's often forgiven for an intentional cartesian product).

A CROSS JOIN is a cartesian product JOIN that's lacking the ON clause that defines the relationship between the 2 tables.

In the ANSI JOIN syntax there are also the OUTER joins: LEFT JOIN, RIGHT JOIN, FULL JOIN

And the normal JOIN, aka the INNER JOIN.

But those normally require the ON clause, while a CROSS JOIN doesn't.

An example of a query using different JOIN types:

SELECT *
FROM jars
JOIN apples ON apples.jar_id = jars.id
LEFT JOIN peaches ON peaches.jar_id = jars.id
CROSS JOIN bananas AS bnns
RIGHT JOIN crates ON crates.id = jars.crate_id
FULL JOIN nuts ON nuts.jar_id = jars.id
WHERE jars.name = 'FruityMix'

The nice thing about the JOIN syntax is that the link criteria and the search criteria are separated.

In the old comma style that difference is harder to notice, so it's easier to forget a link criterion.

SELECT *
FROM crates, jars, apples, peaches, bananas, nuts
WHERE apples.jar_id = jars.id
AND jars.name = 'NuttyFruitBomb'
AND peaches.jar_id = jars.id(+)
AND crates.id(+) = jars.crate_id;

Did you notice that the first query has 1 cartesian product join, but the second has 2? That's why the 2nd is rather nutty.
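Here's a rough sketch of how a forgotten link criterion plays out in the comma style. It assumes Python's built-in sqlite3 module and a cut-down, made-up version of the jars/apples/nuts tables (the rows are invented for the demo, and the Oracle-only (+) operator is left out because SQLite doesn't have it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jars   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE apples (id INTEGER PRIMARY KEY, jar_id INTEGER);
CREATE TABLE nuts   (id INTEGER PRIMARY KEY, jar_id INTEGER);
INSERT INTO jars   VALUES (1, 'FruityMix'), (2, 'Other');
INSERT INTO apples VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO nuts   VALUES (1, 1), (2, 2), (3, 2), (4, 2);
""")

# Every table is linked to jars: 2 apples x 1 nut for the FruityMix jar.
linked = conn.execute("""
    SELECT COUNT(*)
    FROM jars, apples, nuts
    WHERE apples.jar_id = jars.id
    AND nuts.jar_id = jars.id
    AND jars.name = 'FruityMix'
""").fetchone()[0]

# Same query with the nuts link forgotten: each match is repeated for all 4 nuts.
forgotten = conn.execute("""
    SELECT COUNT(*)
    FROM jars, apples, nuts
    WHERE apples.jar_id = jars.id
    AND jars.name = 'FruityMix'
""").fetchone()[0]

print(linked, forgotten)  # 2 8
```

Nothing in the second query looks wrong at a glance, and it runs without error. With JOIN ... ON, most databases would reject an inner join that has no ON clause, so the omission is harder to make by accident.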

Is a query with table separated by comma a cross join query?

It would be a cross join if there weren't a WHERE clause relating the two tables. In this case it's functionally equivalent to an inner join (matching records by id_rel and id).

It's an older syntax for joining tables that is still supported in most systems, but JOIN syntax is largely preferred.

What's the difference between comma separated joins and join on syntax in MySQL?

There is no difference at all.

The first representation makes the query more readable and makes it very clear which join corresponds to which condition.

Not sure I understand the Snowflake FROM syntax using comma and table(...)

Copying comments to an answer for closure (I'll delete this one if Gordon comes back):

  • In SQL, a comma (,) is equivalent to CROSS JOIN.

That's ANSI-89 vs ANSI-92. https://stackoverflow.com/a/3918601/132438

(choose an explicit join if possible, if you have the choice)

Difference between these two joining table approaches?

Other than syntax, for the small snippet, they work exactly the same. But if at all possible, always write new queries using ANSI-JOINs.

Semantically, the comma notation is used to produce a CARTESIAN product between two tables, which means producing a matrix of all records from table A with all records from table B, so two tables with 4 and 6 records respectively produce 24 records. Using the WHERE clause, you can then pick the rows you actually want from this cartesian product. However, MySQL doesn't actually follow through and build this huge matrix; this is only what the query means semantically.

A JOIN syntax is the ANSI standard that more clearly defines how tables interact. By putting the ON clause next to the JOIN, it makes it clear what links the two tables together.

Functionally, they will perform the same for your two queries. The difference comes in when you start using other [OUTER] JOIN types.

For MySQL specifically, comma-notation does have one difference:

STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.

However, it would not be wise to bank on this difference.

CROSS JOIN vs INNER JOIN in SQL

A cross join does not match up the rows; it pairs every row with every row. If you have 100 rows in each table with a 1-to-1 match, you get 10,000 results, whereas an inner join will only return 100 rows in the same situation.

These 2 examples will return the same result:

Cross join

select * from table1 cross join table2 where table1.id = table2.fk_id

Inner join

select * from table1 join table2 on table1.id = table2.fk_id

Use the last method.
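If you want to see those numbers for yourself, this is a small sketch assuming Python's built-in sqlite3 module, with table1 and table2 filled with 100 made-up rows each that match 1 to 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE table2 (fk_id INTEGER)")
ids = [(i,) for i in range(1, 101)]  # 100 rows each, matching 1 to 1
conn.executemany("INSERT INTO table1 VALUES (?)", ids)
conn.executemany("INSERT INTO table2 VALUES (?)", ids)

# Cross join with no filter: every row paired with every row.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM table1 CROSS JOIN table2"
).fetchone()[0]

# Cross join filtered in WHERE versus inner join with ON.
cross_where = conn.execute(
    "SELECT table1.id, table2.fk_id FROM table1 CROSS JOIN table2 "
    "WHERE table1.id = table2.fk_id"
).fetchall()
inner_on = conn.execute(
    "SELECT table1.id, table2.fk_id FROM table1 JOIN table2 "
    "ON table1.id = table2.fk_id"
).fetchall()

print(unfiltered)                               # 10000
print(len(cross_where), len(inner_on))          # 100 100
print(sorted(cross_where) == sorted(inner_on))  # True
```

The results are the same either way; the JOIN ... ON form just keeps the link condition next to the join it belongs to.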

Left join or select from multiple table using comma (,)

First of all, to be completely equivalent, the first query should have been written:

SELECT mw.*,
       nvs.*
FROM mst_words mw
LEFT JOIN (SELECT *
           FROM vocab_stats
           WHERE owner = 1111) AS nvs ON mw.no = nvs.vocab_no
WHERE (nvs.correct > 0)
AND mw.level = 1

So that mw.* and nvs.* together produce the same set as the 2nd query's singular *. The query as you have written can use an INNER JOIN, since it includes a filter on nvs.correct.

The general form

TABLEA LEFT JOIN TABLEB ON <condition>

attempts to find TableB records based on the condition. If the condition fails, the results from TABLEA are kept, with all the columns from TableB set to NULL. In contrast

TABLEA INNER JOIN TABLEB ON <condition>

also attempts to find TableB records based on the condition. However, when the condition fails, the particular record from TableA is removed from the output result set.
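A small sketch of that difference, assuming Python's built-in sqlite3 module and two made-up tables (table_a, table_b and their columns are invented for the demo, loosely modelled on the nvs.correct filter). It also shows why a WHERE filter on a TableB column makes the LEFT JOIN behave like an INNER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE table_b (a_id INTEGER, correct INTEGER);
INSERT INTO table_a VALUES (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO table_b VALUES (1, 5), (2, 0);   -- nothing for id 3
""")

def run(join_type, where=""):
    # Helper for the demo: same query, only the join type and filter vary.
    sql = (f"SELECT a.id, b.correct FROM table_a AS a "
           f"{join_type} JOIN table_b AS b ON b.a_id = a.id {where} "
           f"ORDER BY a.id")
    return conn.execute(sql).fetchall()

left_rows = run("LEFT")    # unmatched row 3 is kept, padded with NULL
inner_rows = run("INNER")  # unmatched row 3 is dropped
print(left_rows)   # [(1, 5), (2, 0), (3, None)]
print(inner_rows)  # [(1, 5), (2, 0)]

# A WHERE filter on the right-hand table throws the NULL-padded rows away
# again, so the LEFT JOIN ends up returning what the INNER JOIN returns.
left_filtered = run("LEFT", "WHERE b.correct > 0")
inner_filtered = run("INNER", "WHERE b.correct > 0")
print(left_filtered == inner_filtered)  # True
```

The NULL-padded row fails the b.correct > 0 test because a comparison with NULL is never true.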

The ANSI standard for CROSS JOIN produces a Cartesian product between the two tables.

TABLEA CROSS JOIN TABLEB
-- # or in older syntax, simply using commas
TABLEA, TABLEB

The intention of the syntax is that EACH row in TABLEA is joined to EACH row in TABLEB. So 4 rows in A and 3 rows in B produce 12 rows of output. When paired with conditions in the WHERE clause, it sometimes produces the same behaviour as the INNER JOIN, since they express the same thing (condition between A and B => keep or not). However, the intention is a lot clearer to the reader when you use INNER JOIN instead of commas.

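Performance-wise, an INNER JOIN is usually at least as fast as the equivalent LEFT JOIN, because the optimizer is free to reorder the tables, whereas a LEFT JOIN constrains the join order and has to keep unmatched rows. Modern optimizers generally plan comma notation and explicit joins identically, but with the conditions scattered through the WHERE clause it is easier to write a query that does not say what you meant, which is another plus for SQL92 notation.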

Why do we need LEFT JOIN? If the explanation of LEFT JOIN above is still not enough (keep records in A without matches in B), then consider that to achieve the same effect with the old comma-notation, you would need a complex UNION between two sets. But as previously stated, this doesn't apply to your example, which is really an INNER JOIN hiding behind a LEFT JOIN.
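To give an idea of what that UNION workaround looks like, here is one possible sketch (not the only way to write it). It assumes Python's built-in sqlite3 module and made-up tables table_a and table_b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY);
CREATE TABLE table_b (a_id INTEGER, correct INTEGER);
INSERT INTO table_a VALUES (1), (2), (3);
INSERT INTO table_b VALUES (1, 5), (2, 0);
""")

# The ANSI-92 way: one LEFT JOIN.
left_join = conn.execute("""
    SELECT a.id, b.correct
    FROM table_a AS a
    LEFT JOIN table_b AS b ON b.a_id = a.id
    ORDER BY 1, 2
""").fetchall()

# Comma notation: the matched rows, plus a second SELECT that adds the
# unmatched rows of table_a padded with NULL.
comma_union = conn.execute("""
    SELECT a.id, b.correct
    FROM table_a AS a, table_b AS b
    WHERE b.a_id = a.id
    UNION ALL
    SELECT a.id, NULL
    FROM table_a AS a
    WHERE NOT EXISTS (SELECT 1 FROM table_b AS b WHERE b.a_id = a.id)
    ORDER BY 1, 2
""").fetchall()

print(left_join)                 # [(1, 5), (2, 0), (3, None)]
print(left_join == comma_union)  # True
```

The link condition has to be written twice in the comma version, once as-is and once negated, and the two copies have to be kept in sync by hand.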

Notes:

  • The RIGHT JOIN is the same as LEFT, except that it starts with TABLEB (right side) instead of A.
  • RIGHT and LEFT JOINS are both OUTER joins. The word OUTER is optional, i.e. it can be written as LEFT OUTER JOIN.
  • The third type of OUTER join is FULL OUTER join, but that is not discussed here.

When is CROSS JOIN useful?

Apart from the simple rule of never using commas in the FROM clause and always using explicit join syntax, there is a good practical reason to prefer CROSS JOIN. The issue is the difference between these two queries:

select *
from table1, table2;

and

select *
from table1 table2;

These do very different things, and it can be rather hard to spot the difference (particularly in a more complicated query). If you never have commas in the FROM clause, then your queries will be easier to read and less prone to typos and other problems.
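To make the difference concrete, here is a quick sketch assuming Python's built-in sqlite3 module, with made-up columns x and y. Without the comma, table2 is no longer a table at all: it becomes an alias for table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (x INTEGER);
CREATE TABLE table2 (y INTEGER);
INSERT INTO table1 VALUES (1), (2);
INSERT INTO table2 VALUES (10), (20), (30);
""")

# With the comma: cartesian product of both tables (2 x 3 rows, 2 columns).
cur = conn.execute("SELECT * FROM table1, table2")
comma_columns = [d[0] for d in cur.description]
comma_rows = cur.fetchall()

# Without the comma: only table1 is read, under the alias "table2".
cur = conn.execute("SELECT * FROM table1 table2")
alias_columns = [d[0] for d in cur.description]
alias_rows = cur.fetchall()

print(comma_columns, len(comma_rows))  # ['x', 'y'] 6
print(alias_columns, len(alias_rows))  # ['x'] 2
```

Both statements are valid SQL, so a dropped comma is not reported as an error; the only hint is the result.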

Is this really a cross join?

Is this really a "cross join"?

The question is rather meaningless. SQL is not a procedural language. A SQL query describes the results from processing, not how the processing is accomplished.

The WHERE clause is describing filtering on the full Cartesian product of the tables. The filtering is equivalent to equi-joins. The syntax is not the best way of expressing the query. But almost any query optimizer (including MySQL) is going to implement this in a more reasonable way than generating the Cartesian product and filtering the result.
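One way to look at what the optimizer does is to ask for the query plan. The sketch below uses SQLite through Python's built-in sqlite3 module rather than MySQL (an assumption made so the example is self-contained); tables a and b are made up. In MySQL you would compare the output of EXPLAIN instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

comma_plan = plan("SELECT * FROM a, b WHERE b.a_id = a.id")
join_plan = plan("SELECT * FROM a JOIN b ON b.a_id = a.id")

for step in comma_plan:
    print(step)
print(comma_plan == join_plan)  # True
```

The exact wording of the steps depends on the SQLite version, so no output is shown for them here. What to look for is a lookup step (a SEARCH using a key or index) on one of the tables, instead of two full scans followed by a filter.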


