When Should I Use Cross Apply Over Inner Join

When should I use CROSS APPLY over INNER JOIN?

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?

See the article in my blog for detailed performance comparison:

  • INNER JOIN vs. CROSS APPLY

CROSS APPLY works better on things that have no simple JOIN condition.

This one selects 3 last records from t2 for each record from t1:

SELECT  t1.*, t2o.*
FROM t1
CROSS APPLY
(
SELECT TOP 3 *
FROM t2
WHERE t2.t1_id = t1.id
ORDER BY
t2.rank DESC
) t2o

It cannot be easily formulated with an INNER JOIN condition.

You could probably do something like that using CTE's and window function:

WITH    t2o AS
(
SELECT t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
FROM t2
)
SELECT t1.*, t2o.*
FROM t1
INNER JOIN
t2o
ON t2o.t1_id = t1.id
AND t2o.rn <= 3

, but this is less readable and probably less efficient.

Update:

Just checked.

master is a table of about 20,000,000 records with a PRIMARY KEY on id.

This query:

WITH    q AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS rn
FROM master
),
t AS
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
JOIN q
ON q.rn <= t.id

runs for almost 30 seconds, while this one:

WITH    t AS 
(
SELECT 1 AS id
UNION ALL
SELECT 2
)
SELECT *
FROM t
CROSS APPLY
(
SELECT TOP (t.id) m.*
FROM master m
ORDER BY
id
) q

is instant.

Benefits of INNER JOIN over CROSS APPLY

You can always use a CROSS APPLY where you'd use an INNER JOIN. But there are reasons you might (and often will) prefer INNER JOIN.

In case the two are equivalent the SQL Server optimizer does not treat them differently in my experience. Therefore, I do not follow the suggestion that a CROSS APPLY is faster. If apples are compared to apples the performance is, in all the query plans I have seen, identical.

INNER JOIN is more convenient to write. Also, it is idiomatic. Therefore, it is most legible and maintainable. INNER JOIN also is more widely supported although that probably does not matter on SQL Server. I also estimate that many developers simply do not know CROSS APPLY.

Difference between cross apply and inner join based on below example

If my understanding of cross apply is correct the reason you are getting different results here is that CROSS APPLY will apply whatever comes after apply (the right operator) to every row in the left operator (#temp). This means that the number of rows in the result will be the same as the number of rows in #temp. Basing my answer off of "The table-valued function acts as the right input and the outer table expression acts as the left input. The right input is evaluated for each row from the left input and the rows produced are combined for the final output." from https://technet.microsoft.com/en-us/library/ms175156(v=sql.105).aspx.

Note that if you really wanted the results to be the same you could change your cross apply query to this:

SELECT *
FROM #temp t1
CROSS apply (SELECT Sum(col1) AS col1
FROM #temp2 t2
WHERE t1.col1 = t2.col1) A
WHERE A.col1 IS NOT NULL

Also, note that INNER JOIN is defined to be where there is a match on both sides. In your case this means that there were only 3 rows. If you had used a LEFT OUTER join instead you would have gotten the same results in this case as the CROSS APPLY.

Why cross apply is faster than inner join?

CROSS APPLY is meant to be used with functions and tables that return result based on parameters.

So, the fact you querying function is the reason why "CROSS APPLY" is faster.

Why CROSS APPLY and INNER JOIN returns different result

Both of the query are not the same.

In Query 1, you wanted the max(QuoteID) group by controlNo

In Query 2, you are getting the max(QuoteID) for each controlNo

If you wanted equivalent using CROSS APPLY, should be

select  * 
from #test1 t
cross apply
(
select max(tt.QuoteID) as QuoteID, tt.controlNo
from #test2 tt
group by tt.controlNo
having max(QuoteID) = t.QuoteID
) a

INNER JOIN vs CROSS JOIN vs CROSS APPLY

The first two are equivalent. Whether you use an inner join or cross join is really a matter of preference in this case. I think I would typically use the cross join, because there is no real join condition between the tables.

Note: You should never use cross join when the intention is a "real" inner join that has matching conditions between the tables.

The cross apply is not doing the same thing. It is only choosing one row. If your intention is to get at most one matching row, then use cross apply. If the intention is to get exactly one matching row, then use outer apply.

When we go for cross apply and when we go for inner join in SQL Server 2012

INNER JOIN and CROSS APPLY (same with LEFT JOIN and OUTER APPLY) are very closely related. In your example I'd assume, that the engine will find the same execution plan.

  • A JOIN is a link between two sets over a condition
  • an APPLY is a row-wise sub-call

But - as mentioned above - the optimizer is very smart and will - at least in such easy cases - understand, that it comes down to the same.

  • The JOIN will try to collect the sub-set and link it over the specified condition
  • The APPLY will try to call the related result with the current row's values over and over.

Differences are in calling table-valued-functions (should be inline-syntax!), with XML-method .nodes() and with more complex scenarios.

One example how one could use APPLY to simulate variables

...to use the result of a row-wise calculation like you'd use a variable:

DECLARE @dummy TABLE(ID INT IDENTITY, SomeString VARCHAR(100));
INSERT INTO @dummy VALUES('Want to split/this at the two/slashes.'),('And/this/also');

SELECT d.ID
,d.SomeString
,pos1
,pos2
,LEFT(d.SomeString,pos1-1)
,SUBSTRING(d.SomeString,pos1+1,pos2-pos1-1)
,SUBSTRING(d.SomeString,pos2+1,1000)
FROM @dummy AS d
CROSS APPLY(SELECT CHARINDEX('/',d.SomeString) AS pos1) AS x
CROSS APPLY(SELECT CHARINDEX('/',d.SomeString,x.pos1+1) AS pos2) AS y

This is the same as the following, but much easier to read (and type):

SELECT d.ID
,d.SomeString
,LEFT(d.SomeString,CHARINDEX('/',d.SomeString)-1)
,SUBSTRING(d.SomeString,CHARINDEX('/',d.SomeString)+1,CHARINDEX('/',d.SomeString,(CHARINDEX('/',d.SomeString)+1))-(CHARINDEX('/',d.SomeString)+1))
,SUBSTRING(d.SomeString,CHARINDEX('/',d.SomeString,(CHARINDEX('/',d.SomeString)+1))+1,1000)
FROM @dummy AS d

One example with XML-method .nodes()

DECLARE @dummy TABLE(SomeXML XML)
INSERT INTO @dummy VALUES
(N'<root>
<a>a1</a>
<a>a2</a>
<a>a3</a>
<b>Here is b!</b>
</root>');

SELECT All_a_nodes.value(N'.',N'nvarchar(max)')
FROM @dummy
CROSS APPLY SomeXML.nodes(N'/root/a') AS A(All_a_nodes);

The result

a1
a2
a3

And one example for an inlined function call

CREATE FUNCTION dbo.TestProduceRows(@i INT)
RETURNS TABLE
AS
RETURN
SELECT TOP(@i) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS Nr FROM master..spt_values
GO

CREATE TABLE dbo.TestData(ID INT IDENTITY, SomeString VARCHAR(100),Number INT);
INSERT INTO dbo.TestData VALUES
('Show me once',1)
,('Show me twice',2)
,('Me five times!',5);

SELECT *
FROM TestData
CROSS APPLY dbo.TestProduceRows(Number) AS x;

GO
DROP TABLE dbo.TestData;
DROP FUNCTION dbo.TestProduceRows;

The result

1   Show me once    1   1
2 Show me twice 2 1
2 Show me twice 2 2
3 Me five times! 5 1
3 Me five times! 5 2
3 Me five times! 5 3
3 Me five times! 5 4
3 Me five times! 5 5

Real life example, when to use OUTER / CROSS APPLY in SQL

Some uses for APPLY are...

1) Top N per group queries (can be more efficient for some cardinalities)

SELECT pr.name,
pa.name
FROM sys.procedures pr
OUTER APPLY (SELECT TOP 2 *
FROM sys.parameters pa
WHERE pa.object_id = pr.object_id
ORDER BY pr.name) pa
ORDER BY pr.name,
pa.name

2) Calling a Table Valued Function for each row in the outer query

SELECT *
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle)

3) Reusing a column alias

SELECT number,
doubled_number,
doubled_number_plus_one
FROM master..spt_values
CROSS APPLY (SELECT 2 * CAST(number AS BIGINT)) CA1(doubled_number)
CROSS APPLY (SELECT doubled_number + 1) CA2(doubled_number_plus_one)

4) Unpivoting more than one group of columns

Assumes 1NF violating table structure....

CREATE TABLE T
(
Id INT PRIMARY KEY,

Foo1 INT, Foo2 INT, Foo3 INT,
Bar1 INT, Bar2 INT, Bar3 INT
);

Example using 2008+ VALUES syntax.

SELECT Id,
Foo,
Bar
FROM T
CROSS APPLY (VALUES(Foo1, Bar1),
(Foo2, Bar2),
(Foo3, Bar3)) V(Foo, Bar);

In 2005 UNION ALL can be used instead.

SELECT Id,
Foo,
Bar
FROM T
CROSS APPLY (SELECT Foo1, Bar1
UNION ALL
SELECT Foo2, Bar2
UNION ALL
SELECT Foo3, Bar3) V(Foo, Bar);


Related Topics



Leave a reply



Submit