What is the main purpose of using CROSS APPLY?
I have read (vaguely, through posts on the Internet) that cross apply
can be more efficient when selectin
Here's a brief tutorial that can be saved in a .sql
file and executed in SSMS that I wrote for myself to quickly refresh my memory on how CROSS APPLY
works and when to use it:
-- Here's the key to understanding CROSS APPLY: despite the totally different name, think of it as being like an advanced 'basic join'.
-- A 'basic join' gives the Cartesian product of the rows in the tables on both sides of the join: all rows on the left joined with all rows on the right.
-- The formal name of this join in SQL is a CROSS JOIN. You now start to understand why they named the operator CROSS APPLY.
-- Given the following (very) simple tables and data:
CREATE TABLE #TempStrings ([SomeString] [nvarchar](10) NOT NULL);
CREATE TABLE #TempNumbers ([SomeNumber] [int] NOT NULL);
CREATE TABLE #TempNumbers2 ([SomeNumber] [int] NOT NULL);
INSERT INTO #TempStrings VALUES ('111'); INSERT INTO #TempStrings VALUES ('222');
INSERT INTO #TempNumbers VALUES (111); INSERT INTO #TempNumbers VALUES (222);
INSERT INTO #TempNumbers2 VALUES (111); INSERT INTO #TempNumbers2 VALUES (222); INSERT INTO #TempNumbers2 VALUES (222);
-- Basic join is like CROSS APPLY; 2 rows on each side gives us an output of 4 rows, but 2 rows on the left and 0 on the right gives us an output of 0 rows:
SELECT
st.SomeString, nbr.SomeNumber
FROM -- Basic join ('CROSS JOIN')
#TempStrings st, #TempNumbers nbr
-- Note: this also works:
--#TempStrings st CROSS JOIN #TempNumbers nbr
-- Basic join can be used to achieve the functionality of INNER JOIN by first generating all row combinations and then whittling them down with a WHERE clause:
SELECT
st.SomeString, nbr.SomeNumber
FROM -- Basic join ('CROSS JOIN')
#TempStrings st, #TempNumbers nbr
WHERE
st.SomeString = nbr.SomeNumber
-- However, for increased readability, the SQL standard introduced the INNER JOIN ... ON syntax for increased clarity; it brings the columns that two tables are
-- being joined on next to the JOIN clause, rather than having them later on in the WHERE clause. When multiple tables are being joined together, this makes it
-- much easier to read which columns are being joined on which tables; but make no mistake, the following syntax is *semantically identical* to the above syntax:
SELECT
st.SomeString, nbr.SomeNumber
FROM -- Inner join
#TempStrings st INNER JOIN #TempNumbers nbr ON st.SomeString = nbr.SomeNumber
-- Because CROSS APPLY is generally used with a subquery, the subquery's WHERE clause will appear next to the join clause (CROSS APPLY), much like the aforementioned
-- 'ON' keyword appears next to the INNER JOIN clause. In this sense, then, CROSS APPLY combined with a subquery that has a WHERE clause is like an INNER JOIN with
-- an ON keyword, but more powerful because it can be used with subqueries (or table-valued functions, where said WHERE clause can be hidden inside the function).
SELECT
st.SomeString, nbr.SomeNumber
FROM
#TempStrings st CROSS APPLY (SELECT * FROM #TempNumbers tempNbr WHERE st.SomeString = tempNbr.SomeNumber) nbr
-- CROSS APPLY joins in the same way as a CROSS JOIN, but what is joined can be a subquery or table-valued function. You'll still get 0 rows of output if
-- there are 0 rows on either side, and in this sense it's like an INNER JOIN:
SELECT
st.SomeString, nbr.SomeNumber
FROM
#TempStrings st CROSS APPLY (SELECT * FROM #TempNumbers tempNbr WHERE 1 = 2) nbr
-- OUTER APPLY is like CROSS APPLY, except that if one side of the join has 0 rows, you'll get the values of the side that has rows, with NULL values for
-- the other side's columns. In this sense it's like a FULL OUTER JOIN:
SELECT
st.SomeString, nbr.SomeNumber
FROM
#TempStrings st OUTER APPLY (SELECT * FROM #TempNumbers tempNbr WHERE 1 = 2) nbr
-- One thing CROSS APPLY makes it easy to do is to use a subquery where you would usually have to use GROUP BY with aggregate functions in the SELECT list.
-- In the following example, we can get an aggregate of string values from a second table based on matching one of its columns with a value from the first
-- table - something that would have had to be done in the ON clause of the LEFT JOIN - but because we're now using a subquery thanks to CROSS APPLY, we
-- don't need to worry about GROUP BY in the main query and so we don't have to put all the SELECT values inside an aggregate function like MIN().
SELECT
st.SomeString, nbr.SomeNumbers
FROM
#TempStrings st CROSS APPLY (SELECT SomeNumbers = STRING_AGG(tempNbr.SomeNumber, ', ') FROM #TempNumbers2 tempNbr WHERE st.SomeString = tempNbr.SomeNumber) nbr
-- ^ First the subquery is whittled down with the WHERE clause, then the aggregate function is applied with no GROUP BY clause; this means all rows are
-- grouped into one, and the aggregate function aggregates them all, in this case building a comma-delimited string containing their values.
DROP TABLE #TempStrings;
DROP TABLE #TempNumbers;
DROP TABLE #TempNumbers2;