query optimizer operator choice - nested loops vs hash match (or merge)

一生所求 2020-12-14 02:28

One of my stored procedures was taking too long to execute. Looking at the query execution plan, I was able to locate the operation that was taking too long. It was a nested loop physi

2 answers
  • 2020-12-14 02:42

    I would not recommend trying to "fix" the plan by forcing hints in one direction or another. Instead, look at your indexes, statistics, and T-SQL code to understand why a Table Spool is loading up 1.2 billion rows from 19,000.

  • 2020-12-14 02:49

    ABSOLUTELY. A hash match would be a huge improvement. Building the hash on the smaller 19,223-row table and then probing it with the larger 65,991-row table is a much smaller operation than a nested loop requiring 1,268,544,993 row comparisons.
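
    To make the size difference concrete, here is a rough back-of-the-envelope calculation using the row counts above. It assumes a naive nested loop that compares every row of one input against every row of the other, and it treats the hash join as touching each row once (one pass to build, one pass to probe); real costs also depend on indexes, memory grants, and hash collisions.

    ```sql
    -- Row counts taken from the execution plan discussed above.
    SELECT
       CAST(19223 AS bigint) * 65991 AS NestedLoopComparisons, -- 1268544993
       19223 + 65991                 AS HashJoinRowsTouched;   -- 85214
    ```

    Under those assumptions the nested loop does roughly 15,000 times more work than the hash join.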

    The most likely reason the server chose nested loops is that it badly underestimated the number of rows involved. Do your tables have statistics on them, and if so, are they being updated regularly? Statistics are what enable the server to choose good execution plans.
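
    As a minimal sketch of how you might check and refresh statistics: the table name dbo.TableA is hypothetical, and sys.dm_db_stats_properties requires SQL Server 2008 R2 SP2, 2012 SP1, or later.

    ```sql
    -- dbo.TableA is a hypothetical table name; substitute your own.
    -- Show when each statistics object was last updated and how many
    -- modifications have accumulated on its leading column since then.
    SELECT s.name AS stats_name,
           sp.last_updated,
           sp.rows,
           sp.rows_sampled,
           sp.modification_counter
    FROM sys.stats AS s
       CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID(N'dbo.TableA');

    -- Rebuild all statistics on the table by scanning every row.
    UPDATE STATISTICS dbo.TableA WITH FULLSCAN;
    ```

    A large modification_counter relative to rows, or an old last_updated date, suggests the optimizer may be estimating from stale data.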

    If you've properly addressed statistics and are still having a problem, you could force it to use a HASH join like so:

    SELECT *
    FROM
       TableA A -- The smaller table
       LEFT HASH JOIN TableB B -- The larger table
          ON B.AId = A.Id -- Hypothetical join columns; substitute your own

    Please note that the moment you do this, it also forces the join order. This means you have to arrange all your tables so that their join order makes sense. Generally, you would examine the execution plan the server already has and alter the order of the tables in your query to match. If you're not familiar with how to do this, the basics are that each "left" input comes first, and in graphical execution plans, the left input is the upper one. A complex join involving many tables may need joins grouped inside parentheses, or a RIGHT JOIN, to get an optimal execution plan (a RIGHT JOIN swaps the left and right inputs while still introducing the table at the correct point in the join order).
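
    The two techniques mentioned above might look roughly like this. All table and column names (TableC, Id, AId, BId) are hypothetical, and whether either arrangement is actually faster depends on your data, so treat this as a sketch of the syntax rather than a recommended plan.

    ```sql
    -- Grouping with parentheses: TableB and TableC are joined first,
    -- and that combined result becomes the second input of the outer join.
    SELECT *
    FROM TableA AS A
       INNER HASH JOIN
       (
          TableB AS B
          INNER HASH JOIN TableC AS C
             ON C.BId = B.Id
       )
          ON B.AId = A.Id;

    -- Swapping inputs with RIGHT JOIN: returns the same rows as
    -- "TableA A LEFT HASH JOIN TableB B", but TableB is now written first.
    SELECT *
    FROM TableB AS B
       RIGHT HASH JOIN TableA AS A
          ON B.AId = A.Id;
    ```

    Compare the actual execution plan before and after each change to confirm that the inputs ended up where you intended.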

    It is generally best to avoid using join hints and forcing join order, so do whatever else you can first! You could look into the indexes on the tables, fragmentation, reducing column sizes (such as using varchar instead of nvarchar where Unicode is not required), or splitting the query into parts (insert into a temp table first, then join to that).
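
    For the last suggestion, splitting the query could be sketched as follows. The column names and the filter are hypothetical placeholders. The idea is that a temp table gets its own statistics, so the optimizer sees the real row count of the intermediate result instead of guessing.

    ```sql
    -- Step 1: stage the smaller, filtered set (hypothetical columns and filter).
    SELECT A.Id, A.SomeColumn
    INTO #SmallSet
    FROM TableA AS A
    WHERE A.SomeColumn = 'some filter';

    -- Optional: index the temp table on the join key.
    CREATE CLUSTERED INDEX IX_SmallSet_Id ON #SmallSet (Id);

    -- Step 2: join to the staged rows without any join hint.
    SELECT S.Id, S.SomeColumn, B.OtherColumn
    FROM #SmallSet AS S
       LEFT JOIN TableB AS B
          ON B.AId = S.Id;

    DROP TABLE #SmallSet;
    ```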
