Is there a performance difference in using a GROUP BY with MAX() as the aggregate vs ROW_NUMBER over partition by?

生来不讨喜 2021-01-18 00:15

Is there a performance difference between the following two queries, and if so, which one is better?

    select 
    q.id, 
    q.name 
    from(
                


        
3 Answers
  •  灰色年华
    2021-01-18 00:29

    I had a table of about 4.5M rows, and I wrote both a MAX with GROUP BY solution and a ROW_NUMBER solution and tested them. The MAX version requires two clustered index scans of the table, one to aggregate and a second to join back to the rest of the columns, whereas ROW_NUMBER needed only one. (Obviously one or both of these could be indexed to minimize IO, but the point is that GROUP BY requires two index scans.)
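
    The two approaches compared above can be sketched roughly as follows. This is only an illustration, not the queries that were actually tested: the `orders` table, its columns, and the sample rows are hypothetical, and the goal assumed here is "latest order per customer".

    ```sql
    -- Hypothetical table and data, invented for illustration only.
    CREATE TABLE orders (
        id          INT PRIMARY KEY,
        customer_id INT NOT NULL,
        order_date  DATE NOT NULL,
        amount      DECIMAL(10, 2) NOT NULL
    );

    INSERT INTO orders (id, customer_id, order_date, amount) VALUES
        (1, 10, '2021-01-01', 25.00),
        (2, 10, '2021-01-05', 40.00),
        (3, 20, '2021-01-03', 15.00),
        (4, 20, '2021-01-02', 60.00);

    -- Approach 1: MAX() with GROUP BY, joined back to fetch the other columns.
    -- The table is referenced twice: once to aggregate, once for the join.
    SELECT o.id, o.customer_id, o.order_date, o.amount
    FROM orders AS o
    JOIN (
        SELECT customer_id, MAX(order_date) AS max_date
        FROM orders
        GROUP BY customer_id
    ) AS m
        ON m.customer_id = o.customer_id
       AND m.max_date = o.order_date;

    -- Approach 2: ROW_NUMBER() over a partition. The table is referenced once.
    SELECT r.id, r.customer_id, r.order_date, r.amount
    FROM (
        SELECT o.*,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY order_date DESC) AS rn
        FROM orders AS o
    ) AS r
    WHERE r.rn = 1;
    ```

    With this sample data both queries return the rows with id 2 and id 3. Note that the two forms are not equivalent when there are ties: if a customer has two rows sharing the maximum date, the join returns both, while ROW_NUMBER keeps exactly one.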

    According to the optimizer's subtree cost, ROW_NUMBER is about 60% more efficient in my case, and according to statistics IO it uses about 20% less CPU time. However, the ROW_NUMBER solution takes about 80% more real elapsed time, so GROUP BY wins in my case.
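
    For anyone wanting to collect similar numbers, a minimal sketch follows. It assumes SQL Server, which the post does not name explicitly (the terms "clustered scan" and "subtree cost" suggest it). The query against `sys.objects` is just a stand-in so the snippet runs without any setup.

    ```sql
    -- Assumes SQL Server (T-SQL). Results appear in the Messages output.
    SET STATISTICS IO ON;    -- reports scan count and logical/physical reads per table
    SET STATISTICS TIME ON;  -- reports CPU time and elapsed time per statement

    -- Stand-in query: replace with the MAX/GROUP BY or ROW_NUMBER query under test.
    SELECT COUNT(*) AS object_count
    FROM sys.objects;

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;
    ```

    The estimated subtree cost is a separate figure taken from the execution plan. It is an optimizer estimate, which is one reason it can disagree with measured elapsed time, as it did here.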

    This seems to match the other answers here.
