Question
Suppose we have this simple schema and data:
DROP TABLE IF EXISTS #builds
CREATE TABLE #builds (
Id INT IDENTITY(1,1) NOT NULL,
StartTime INT,
IsPassed BIT
)
INSERT INTO #builds (StartTime, IsPassed) VALUES
(1, 1),
(7, 1),
(10, 0),
(15, 1),
(21, 1),
(26, 0),
(34, 0),
(44, 0),
(51, 1),
(60, 1)
SELECT StartTime, IsPassed, NextStartTime,
CASE IsPassed WHEN 1 THEN 0 ELSE NextStartTime - StartTime END Duration
FROM (
SELECT
LEAD(StartTime) OVER (ORDER BY StartTime) NextStartTime,
StartTime, IsPassed
FROM #builds
) x
ORDER BY StartTime
It produces the following result set:
StartTime IsPassed NextStartTime Duration
1 1 7 0
7 1 10 0
10 0 15 5
15 1 21 0
21 1 26 0
26 0 34 8
34 0 44 10
44 0 51 7
51 1 60 0
60 1 NULL 0
I need to sum each run of consecutive non-zero Duration values and show the total at the StartTime of the first row in that run. That is, I need to get this:
StartTime Duration
10 5
26 25
I just can't figure out how to do it.
PS: The real table contains many more rows, of course.
Answer 1:
This is a gaps and islands problem: each run of rows where IsPassed is constant has to be placed in its own group. That can be done by taking the difference between ROW_NUMBER() over the entire table and ROW_NUMBER() partitioned by IsPassed. You can then SUM the Duration values for each group where IsPassed = 0 (false) and take MIN(StartTime) to get the StartTime of the first row of the group:
WITH CTE AS (
SELECT StartTime, IsPassed,
LEAD(StartTime) OVER (ORDER BY StartTime) AS NextStartTime
FROM #builds
),
CTE2 AS (
SELECT StartTime, IsPassed, NextStartTime,
CASE IsPassed WHEN 1 THEN 0 ELSE NextStartTime - StartTime END Duration,
ROW_NUMBER() OVER (ORDER BY StartTime) -
ROW_NUMBER() OVER (PARTITION BY IsPassed ORDER BY StartTime) AS grp
FROM CTE
)
SELECT MIN(StartTime) AS StartTime, SUM(Duration) AS Duration
FROM CTE2
WHERE IsPassed = 0
GROUP BY grp
ORDER BY MIN(StartTime)
Output:
StartTime Duration
10 5
26 25
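To see why the difference of the two row numbers works, it may help to trace it outside the database. The following is only a sketch in plain Python, not part of the original answer: the function name island_totals and the variable names are made up for illustration, and a missing next row (NULL in SQL) is simply treated as contributing nothing to the sum.

```python
from collections import defaultdict

# Sample data from the question: (StartTime, IsPassed)
builds = [(1, 1), (7, 1), (10, 0), (15, 1), (21, 1),
          (26, 0), (34, 0), (44, 0), (51, 1), (60, 1)]

def island_totals(rows):
    """Hypothetical helper that mimics the ROW_NUMBER() difference trick."""
    rows = sorted(rows)                    # ORDER BY StartTime
    per_value_rn = defaultdict(int)        # row counter per IsPassed value
    trace = []                             # (StartTime, IsPassed, grp) for every row
    totals = {}                            # grp -> [first StartTime, summed Duration]
    for i, (start, passed) in enumerate(rows):
        overall_rn = i + 1                 # ROW_NUMBER() OVER (ORDER BY StartTime)
        per_value_rn[passed] += 1          # ROW_NUMBER() OVER (PARTITION BY IsPassed ...)
        grp = overall_rn - per_value_rn[passed]
        trace.append((start, passed, grp))
        if passed == 0:                    # WHERE IsPassed = 0
            # LEAD(StartTime); assumption: a missing next row adds 0 to the sum
            nxt = rows[i + 1][0] if i + 1 < len(rows) else None
            duration = 0 if nxt is None else nxt - start
            entry = totals.setdefault(grp, [start, 0])  # first row seen has MIN(StartTime)
            entry[1] += duration
    return trace, sorted((s, d) for s, d in totals.values())

trace, result = island_totals(builds)
for row in trace:
    print(row)
print(result)  # [(10, 5), (26, 25)]
```

On this data the grp values come out as 0, 0, 2, 1, 1, 4, 4, 4, 4, 4. The failed rows at 26, 34 and 44 share grp 4 with the passed rows at 51 and 60, so grp on its own does not separate runs with different IsPassed values. The WHERE IsPassed = 0 filter is what keeps the groups distinct here.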
Answer 2:
Your approach is unnecessarily complicated. You simply need to assign each run of 0s to a group that also contains the next 1 that follows it.
You can do this by counting the number of "1"s on or after each row. Of course, this also assigns a group to rows that are not part of any run of "0"s. These can be filtered out by ensuring that there is at least one 0 in each group:
select min(StartTime) as StartTime, max(StartTime) - min(StartTime) as Duration
from (select b.*,
sum(case when IsPassed = 1 then 1 else 0 end) over (order by StartTime desc) as grp
from #builds b
) b
group by grp
having min(convert(int, IsPassed)) = 0
order by min(StartTime);
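The grouping can be checked by hand with a small sketch in plain Python. This is an illustration only, not the answer's code: the function name totals_by_reverse_count is invented, and it assumes StartTime values are unique so that the running count never has to deal with ties.

```python
# Sample data from the question: (StartTime, IsPassed)
builds = [(1, 1), (7, 1), (10, 0), (15, 1), (21, 1),
          (26, 0), (34, 0), (44, 0), (51, 1), (60, 1)]

def totals_by_reverse_count(rows):
    """Hypothetical helper: group rows by the number of 1s on or after them."""
    ones_so_far = 0
    groups = {}                                   # grp -> list of (StartTime, IsPassed)
    for start, passed in sorted(rows, reverse=True):   # ORDER BY StartTime DESC
        ones_so_far += passed                     # running SUM(CASE WHEN IsPassed = 1 ...)
        groups.setdefault(ones_so_far, []).append((start, passed))
    out = []
    for members in groups.values():               # GROUP BY grp
        if min(p for _, p in members) == 0:       # HAVING MIN(IsPassed) = 0
            starts = [s for s, _ in members]
            out.append((min(starts), max(starts) - min(starts)))
    return sorted(out)                            # ORDER BY MIN(StartTime)

print(totals_by_reverse_count(builds))  # [(10, 5), (26, 25)]
```

Walking the sample data from the last row backwards gives grp values 1, 2, 2, 2, 2, 3, 4, 4, 5, 6. Group 2 holds the rows starting at 51, 44, 34 and 26, so max minus min is 25. Group 4 holds 15 and 10, giving 5. The remaining groups contain only passed rows and are dropped by the HAVING condition.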
An alternative method uses no GROUP BY aggregation at all, only window functions. It simply gets the StartTime of the next "1" for each row and then keeps only the first "0" row of each run:
select StartTime, next_1_starttime - StartTime as Duration
from (select b.*,
lag(IsPassed) over (order by StartTime) as prev_IsPassed,
min(case when IsPassed = 1 then StartTime end) over (order by StartTime desc) as next_1_starttime
from #builds b
) b
where IsPassed = 0 and (prev_IsPassed = 1 or prev_IsPassed is null)
order by StartTime;
This probably has the best performance of the alternatives.
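The same idea can be sketched in plain Python to make the two window columns concrete. This is a rough illustration under stated assumptions rather than the answer's implementation: the name first_fail_to_next_pass is made up, and a run of failures with no later pass yields None, standing in for the NULL the SQL would return.

```python
# Sample data from the question: (StartTime, IsPassed)
builds = [(1, 1), (7, 1), (10, 0), (15, 1), (21, 1),
          (26, 0), (34, 0), (44, 0), (51, 1), (60, 1)]

def first_fail_to_next_pass(rows):
    """Hypothetical helper: time from the first 0 of each run to the next 1."""
    rows = sorted(rows)                            # ORDER BY StartTime
    # Scan backwards so each row learns the nearest passed StartTime at or
    # after it, like MIN(CASE ...) OVER (ORDER BY StartTime DESC).
    next_pass = [None] * len(rows)
    nearest = None
    for i in range(len(rows) - 1, -1, -1):
        start, passed = rows[i]
        if passed == 1:
            nearest = start
        next_pass[i] = nearest
    out = []
    for i, (start, passed) in enumerate(rows):
        prev_passed = rows[i - 1][1] if i > 0 else None   # LAG(IsPassed)
        if passed == 0 and prev_passed in (1, None):      # first 0 of a run
            gap = None if next_pass[i] is None else next_pass[i] - start
            out.append((start, gap))
    return out

print(first_fail_to_next_pass(builds))  # [(10, 5), (26, 25)]
```

Only one row per run survives the filter, which is why no grouping step is needed.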
Source: https://stackoverflow.com/questions/59889916/tag-consecutive-non-zero-rows-into-distinct-partitions