SQL Efficiency: WHERE IN Subquery vs. JOIN then GROUP

予麋鹿 2021-02-07 09:35

As an example, I want to get the list of all items with certain tags applied to them. I could do either of the following:

SELECT Item.ID, Item.Name
FROM Item
WH         


        
7 Answers
  •  忘掉有多难
    2021-02-07 09:47

    SELECT Item.ID, Item.Name
    FROM Item
    WHERE Item.ID IN (
        SELECT ItemTag.ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
    

    or

    SELECT Item.ID, Item.Name
    FROM Item
    LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    GROUP BY Item.ID
    

    Your second query won't compile, since it references Item.Name without either grouping or aggregating on it.
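If the GROUP BY is kept, one common fix is to list every selected, non-aggregated column in it. The sketch below is an illustration only: the sample rows are made up, and it runs on SQLite through Python's `sqlite3` module rather than on SQL Server. SQLite accepts ungrouped columns, so it cannot reproduce the error itself; it only shows the corrected query returning each item once.

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags, item 2 one, item 3 neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE ItemTag (ItemID INTEGER NOT NULL, TagID INTEGER NOT NULL);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 57), (3, 12);
""")

# Same query as in the question, with Item.Name added to the GROUP BY list.
GROUPED_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    LEFT JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    GROUP BY Item.ID, Item.Name
    ORDER BY Item.ID
"""

grouped_rows = conn.execute(GROUPED_QUERY).fetchall()
print(grouped_rows)
```

Grouping collapses the two matching tag rows of item 1 into a single result row.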

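    If we remove GROUP BY from the query (and use an inner JOIN, which returns the same rows here because the WHERE clause already discards items without a matching tag):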

    SELECT  Item.ID, Item.Name
    FROM    Item
    JOIN    ItemTag
    ON      ItemTag.ItemID = Item.ID
    WHERE   ItemTag.TagID = 57 OR ItemTag.TagID = 55
    

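    these are still different queries, unless ItemTag.ItemID is a UNIQUE key and declared as such: the JOIN returns an item once per matching tag, so an item carrying both tags appears twice, while the IN query returns it only once.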
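The difference is easy to see on a small data set. The following sketch uses hypothetical rows and SQLite (via Python's `sqlite3`) as a stand-in for SQL Server; it shows only which rows come back and says nothing about execution plans.

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags, item 2 one, item 3 neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE ItemTag (ItemID INTEGER NOT NULL, TagID INTEGER NOT NULL);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 57), (3, 12);
""")

IN_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    WHERE Item.ID IN (
        SELECT ItemTag.ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
    ORDER BY Item.ID
"""

JOIN_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    JOIN ItemTag ON ItemTag.ItemID = Item.ID
    WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    ORDER BY Item.ID
"""

in_rows = conn.execute(IN_QUERY).fetchall()
join_rows = conn.execute(JOIN_QUERY).fetchall()
print(in_rows)    # item 1 is listed once
print(join_rows)  # item 1 is listed once per matching tag

# Drop item 1's second tag so every ItemID occurs at most once in ItemTag.
conn.execute("DELETE FROM ItemTag WHERE ItemID = 1 AND TagID = 55")
unique_in_rows = conn.execute(IN_QUERY).fetchall()
unique_join_rows = conn.execute(JOIN_QUERY).fetchall()
print(unique_in_rows == unique_join_rows)
```

Once ItemID values no longer repeat, the two queries return the same rows, which is the case the UNIQUE condition describes.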

    SQL Server is able to detect an IN condition on a UNIQUE column, and will just transform the IN condition into a JOIN.

    If ItemTag.ItemID is not UNIQUE, the first query will use a kind of SEMI JOIN algorithm, which is quite efficient in SQL Server.

    You can transform the second query into a JOIN that is equivalent to the first by joining against a DISTINCT subquery:

    SELECT  Item.ID, Item.Name
    FROM    Item
    JOIN    (
            SELECT DISTINCT ItemID
            FROM   ItemTag
            WHERE  ItemTag.TagID = 57 OR ItemTag.TagID = 55
            ) tags
    ON      tags.ItemID = Item.ID
    

    but this one is a trifle less efficient than IN or EXISTS.
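The answer mentions EXISTS without showing it. A minimal sketch of that form is given below, checked against the IN query and the DISTINCT join on hypothetical sample rows. It runs on SQLite through Python's `sqlite3`, not on SQL Server, so it confirms only that the three forms return the same rows; it does not measure the performance difference described above.

```python
import sqlite3

# Hypothetical sample data: item 1 carries both tags, item 2 one, item 3 neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE ItemTag (ItemID INTEGER NOT NULL, TagID INTEGER NOT NULL);
    INSERT INTO Item VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO ItemTag VALUES (1, 55), (1, 57), (2, 57), (3, 12);
""")

IN_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    WHERE Item.ID IN (
        SELECT ItemTag.ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55)
    ORDER BY Item.ID
"""

# EXISTS form: the subquery is correlated on Item.ID.
EXISTS_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    WHERE EXISTS (
        SELECT 1
        FROM ItemTag
        WHERE ItemTag.ItemID = Item.ID
          AND (ItemTag.TagID = 57 OR ItemTag.TagID = 55))
    ORDER BY Item.ID
"""

# JOIN against a de-duplicated list of item IDs, as in the answer.
DISTINCT_JOIN_QUERY = """
    SELECT Item.ID, Item.Name
    FROM Item
    JOIN (
        SELECT DISTINCT ItemID
        FROM ItemTag
        WHERE ItemTag.TagID = 57 OR ItemTag.TagID = 55
    ) tags ON tags.ItemID = Item.ID
    ORDER BY Item.ID
"""

in_rows = conn.execute(IN_QUERY).fetchall()
exists_rows = conn.execute(EXISTS_QUERY).fetchall()
distinct_join_rows = conn.execute(DISTINCT_JOIN_QUERY).fetchall()
print(in_rows, exists_rows, distinct_join_rows)
```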

    See this article in my blog for a more detailed performance comparison:

    • IN vs. JOIN vs. EXISTS
