Selecting SUM of TOP 2 values within a table with multiple GROUP in SQL

為{幸葍}努か 提交于 2019-12-01 03:29:38

The best way to do this in SQL Server is with a common table expression, numbering the rows in each group with the windowing function ROW_NUMBER():

WITH NumberedPeriods AS (
  SELECT HoursCTR, Duration, ROW_NUMBER() 
    OVER (PARTITION BY HoursCTR ORDER BY Duration DESC) AS RN
  FROM #Periods
  WHERE Rest = 1
)
SELECT HoursCTR, SUM(Duration) AS LongestBreaks
FROM NumberedPeriods
WHERE RN <= 2
GROUP BY HoursCTR

edit: I've added an ORDER BY clause in the partitioning, to get the two longest rests.


Mea culpa, I did not notice that you need this to work in Microsoft SQL Server 2000. That version doesn't support CTE's or windowing functions. I'll leave the answer above in case it helps someone else.

In SQL Server 2000, the common advice is to use a correlated subquery:

SELECT p1.HoursCTR, (SELECT SUM(t.Duration) FROM 
    (SELECT TOP 2 p2.Duration FROM #Periods AS p2
     WHERE p2.HoursCTR = p1.HoursCTR 
     ORDER BY p2.Duration DESC) AS t) AS LongestBreaks
FROM #Periods AS p1

SQL 2000 does not have CTE's, nor ROW_NUMBER().
Correlated subqueries can need an extra step when using group by.

This should work for you:

SELECT 
    F.HoursCTR,
    MAX (F.LongestBreaks) AS LongestBreaks -- Dummy max() so that groupby can be used.
FROM
    (
        SELECT 
            Pm.HoursCTR, 
            (
                SELECT 
                    COALESCE (SUM (S.Duration), 0)    
                FROM 
                    (
                        SELECT TOP 2    T.Duration
                        FROM            #Periods    AS T
                        WHERE           T.HoursCTR  = Pm.HoursCTR 
                        AND             T.Rest      = 1
                        ORDER BY        T.Duration  DESC
                    ) AS S
             ) AS LongestBreaks
        FROM
            #Periods AS Pm
    ) AS F
GROUP BY
    F.HoursCTR

Unfortunately for you, Alex, you've got the right solution: correlated subqueries, depending upon how they're structured, will end up firing multiple times, potentially giving you hundreds of individual query executions.

Put your current solution into the Query Analyzer, enable "Show Execution Plan" (Ctrl+K), and run it. You'll have an extra tab at the bottom which will show you how the engine went about the process of gathering your results. If you do the same with the correlated subquery, you'll see what that option does.

I believe that it's likely to hammer the #Periods table about as many times as you have individual rows in that table.

Also - something's off about the correlated subquery, seems to me. Since I avoid them like the plague, knowing that they're evil, I'm not sure how to go about fixing it up.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!