When should I use Cross Apply over Inner Join?

What is the main purpose of using CROSS APPLY?

I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)

I also know that CROSS APPLY doesn't require a UDF as the right-table.

In most INNER JOIN queries (one-to-many relationships), I could rewrite them to use CROSS APPLY , but they always give me equivalent execution plans.

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?


Edit:

Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply is faster/more efficient)

create table Company (
    companyId int identity(1,1)
,   companyName varchar(100)
,   zipcode varchar(10) 
,   constraint PK_Company primary key (companyId)
)
GO

create table Person (
    personId int identity(1,1)
,   personName varchar(100)
,   companyId int
,   constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
,   constraint PK_Person primary key (personId)
)
GO

insert Company
select 'ABC Company', '19808' union
select 'XYZ Company', '08534' union
select '123 Company', '10016'


insert Person
select 'Alan', 1 union
select 'Bobby', 1 union
select 'Chris', 1 union
select 'Xavier', 2 union
select 'Yoshi', 2 union
select 'Zambrano', 2 union
select 'Player 1', 3 union
select 'Player 2', 3 union
select 'Player 3', 3 


/* using CROSS APPLY */
select *
from Person p
cross apply (
    select *
    from Company c
    where p.companyid = c.companyId
) Czip

/* the equivalent query using INNER JOIN */
select *
from Person p
inner join Company c on p.companyid = c.companyId

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?

See the article in my blog for detailed performance comparison:

  • INNER JOIN vs. CROSS APPLY
  • CROSS APPLY works better on things that have no simple JOIN condition.

    This one selects 3 last records from t2 for each record from t1 :

    SELECT  t1.*, t2o.*
    FROM    t1
    CROSS APPLY
            (
            SELECT  TOP 3 *
            FROM    t2
            WHERE   t2.t1_id = t1.id
            ORDER BY
                    t2.rank DESC
            ) t2o
    

    It cannot be easily formulated with an INNER JOIN condition.

    You could probably do something like that using CTE 's and window function:

    WITH    t2o AS
            (
            SELECT  t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank) AS rn
            FROM    t2
            )
    SELECT  t1.*, t2o.*
    FROM    t1
    INNER JOIN
            t2o
    ON      t2o.t1_id = t1.id
            AND t2o.rn <= 3
    

    , but this is less readable and probably less efficient.

    Update:

    Just checked.

    master is a table of about 20,000,000 records with a PRIMARY KEY on id .

    This query:

    WITH    q AS
            (
            SELECT  *, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM    master
            ),
            t AS 
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    JOIN    q
    ON      q.rn <= t.id
    

    runs for almost 30 seconds, while this one:

    WITH    t AS 
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    CROSS APPLY
            (
            SELECT  TOP (t.id) m.*
            FROM    master m
            ORDER BY
                    id
            ) q
    

    is instant.


    cross apply sometimes enables you to do things that you cannot do with inner join .

    Example (a syntax error):

    select F.* from sys.objects O  
    inner join dbo.myTableFun(O.name) F   
    on F.schema_id= O.schema_id
    

    This is a syntax error , because, when used with inner join , table functions can only take variables or constants as parameters. (Ie, the table function parameter cannot depend on another table's column.)

    However:

    select F.* from sys.objects O  
    cross apply ( select * from dbo.myTableFun(O.name) ) F  
    where F.schema_id= O.schema_id
    

    This is legal.

    Edit: Or alternatively, shorter syntax: (by ErikE)

    select F.* from sys.objects O  
    cross apply dbo.myTableFun(O.name) F
    where F.schema_id= O.schema_id
    

    Edit:

    Note: Informix 12.10 xC2+ has Lateral Derived Tables and Postgresql (9.3+) has Lateral Subqueries which can be used to a similar effect.


    Consider you have two tables.

    MASTER TABLE

    x------x--------------------x
    | Id   |        Name        |
    x------x--------------------x
    |  1   |          A         |
    |  2   |          B         |
    |  3   |          C         |
    x------x--------------------x
    

    DETAILS TABLE

    x------x--------------------x-------x
    | Id   |      PERIOD        |   QTY |
    x------x--------------------x-------x
    |  1   |   2014-01-13       |   10  |
    |  1   |   2014-01-11       |   15  |
    |  1   |   2014-01-12       |   20  |
    |  2   |   2014-01-06       |   30  |
    |  2   |   2014-01-08       |   40  |
    x------x--------------------x-------x
    

    There are many situations where we need to replace INNER JOIN with CROSS APPLY .

    1. Join two tables based on TOP n results

    Consider if we need to select Id and Name from Master and last two dates for each Id from Details table .

    SELECT M.ID,M.NAME,D.PERIOD,D.QTY
    FROM MASTER M
    INNER JOIN
    (
        SELECT TOP 2 ID, PERIOD,QTY 
        FROM DETAILS D      
        ORDER BY CAST(PERIOD AS DATE)DESC
    )D
    ON M.ID=D.ID
    
  • SQL FIDDLE
  • The above query generates the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    x------x---------x--------------x-------x
    

    See, it generated results for last two dates with last two date's Id and then joined these records only in the outer query on Id , which is wrong. To accomplish this, we need to use CROSS APPLY .

    SELECT M.ID,M.NAME,D.PERIOD,D.QTY
    FROM MASTER M
    CROSS APPLY
    (
        SELECT TOP 2 ID, PERIOD,QTY 
        FROM DETAILS D  
        WHERE M.ID=D.ID
        ORDER BY CAST(PERIOD AS DATE)DESC
    )D
    
  • SQL FIDDLE
  • and forms the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-08   |  40   |
    |   2  |   B     | 2014-01-06   |  30   |
    x------x---------x--------------x-------x
    

    Here's how it works. The query inside CROSS APPLY can reference the outer table, where INNER JOIN cannot do this (it throws compile error). When finding the last two dates, joining is done inside CROSS APPLY ie, WHERE M.ID=D.ID .

    2. When we need INNER JOIN functionality using functions.

    CROSS APPLY can be used as a replacement with INNER JOIN when we need to get result from Master table and a function .

    SELECT M.ID,M.NAME,C.PERIOD,C.QTY
    FROM MASTER M
    CROSS APPLY dbo.FnGetQty(M.ID) C
    

    And here is the function

    CREATE FUNCTION FnGetQty 
    (   
        @Id INT 
    )
    RETURNS TABLE 
    AS
    RETURN 
    (
        SELECT ID,PERIOD,QTY 
        FROM DETAILS
        WHERE ID=@Id
    )
    
  • SQL FIDDLE
  • which generated the following result

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-11   |  15   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-06   |  30   |
    |   2  |   B     | 2014-01-08   |  40   |
    x------x---------x--------------x-------x
    

    ADDITIONAL ADVANTAGE OF CROSS APPLY

    APPLY can be used as a replacement for UNPIVOT . Either CROSS APPLY or OUTER APPLY can be used here, which are interchangeable.

    Consider you have the below table(named MYTABLE ).

    x------x-------------x--------------x
    |  Id  |   FROMDATE  |   TODATE     |
    x------x-------------x--------------x
    |   1  |  2014-01-11 | 2014-01-13   | 
    |   1  |  2014-02-23 | 2014-02-27   | 
    |   2  |  2014-05-06 | 2014-05-30   | 
    |   3  |     NULL    |    NULL      |
    x------x-------------x--------------x
    

    The query is below.

    SELECT DISTINCT ID,DATES
    FROM MYTABLE 
    CROSS APPLY(VALUES (FROMDATE),(TODATE))
    COLUMNNAMES(DATES)
    
  • SQL FIDDLE
  • which brings you the result

      x------x-------------x
      | Id   |    DATES    |
      x------x-------------x
      |  1   |  2014-01-11 |
      |  1   |  2014-01-13 |
      |  1   |  2014-02-23 |
      |  1   |  2014-02-27 |
      |  2   |  2014-05-06 |
      |  2   |  2014-05-30 | 
      |  3   |    NULL     | 
      x------x-------------x
    
    链接地址: http://www.djcxy.com/p/2470.html

    上一篇: 为什么这个程序被三个C ++编译器错误地拒绝了?

    下一篇: 什么时候应该使用交叉套用内部联结?