Question
I'm trying to solve this problem in SQL Server 2008. I have a table like this:
DECLARE @table TABLE (
TimeStamp DATETIME,
val INT,
typerow VARCHAR(3)
);
INSERT INTO @table(TimeStamp, val, typerow)
VALUES
('2018-06-03 13:30:00.000', 6, 'out'),
('2018-06-03 14:10:00.000', 8, 'out'),
('2018-06-03 14:30:00.000', 3, 'in'),
('2018-06-03 15:00:00.000', 9, 'out'),
('2018-06-03 15:30:00.000', 4, 'out'),
('2018-06-03 16:00:00.000', 2, 'out'),
('2018-06-03 17:05:00.000', 8, 'in'),
('2018-06-03 17:30:00.000', 0, 'out'),
('2018-06-03 18:15:00.000', 7, 'out'),
('2018-06-03 18:30:00.000', 1, 'in'),
('2018-06-03 19:00:00.000', 5, 'out')
This table contains distinct TimeStamp values, each with an associated value val and a binary column typerow ('in'/'out').
With @table sorted by TimeStamp ascending, I need a way to get a table in which every row with typerow = 'in' contains, in its val column, its current value plus the sum of all preceding val values where typerow = 'out', going back to the previous typerow = 'in' record.
For the first record with typerow = 'in', the sum naturally extends back to the first record of @table:
2018-06-03 13:30:00.000 6 out
2018-06-03 14:10:00.000 8 out
2018-06-03 14:30:00.000 17 in -- 6 + 8 + 3
2018-06-03 15:00:00.000 9 out
2018-06-03 15:30:00.000 4 out
2018-06-03 16:00:00.000 2 out
2018-06-03 17:05:00.000 23 in -- 9 + 4 + 2 + 8
2018-06-03 17:30:00.000 0 out
2018-06-03 18:15:00.000 7 out
2018-06-03 18:30:00.000 8 in -- 0 + 7 + 1
2018-06-03 19:00:00.000 5 out
Since @table will hold hundreds of records like these, my first idea is to add a new id column and assign the same id to all records involved in the same summation (maybe this is possible with a recursive CTE?), to get this result:
2018-06-03 13:30:00.000 6 out 1
2018-06-03 14:10:00.000 8 out 1
2018-06-03 14:30:00.000 17 in 1
2018-06-03 15:00:00.000 9 out 2
2018-06-03 15:30:00.000 4 out 2
2018-06-03 16:00:00.000 2 out 2
2018-06-03 17:05:00.000 23 in 2
2018-06-03 17:30:00.000 0 out 3
2018-06-03 18:15:00.000 7 out 3
2018-06-03 18:30:00.000 8 in 3
2018-06-03 19:00:00.000 5 out don't care for this element
and then compute a new column like
SELECT SUM(val) OVER (PARTITION BY id ORDER BY id) AS partial_sum
and update the val column with partial_sum where typerow = 'in'.
I don't know how to create the new id column correctly, or whether this is a good solution given my SQL Server version.
Thanks in advance for your support, any suggestion is appreciated.
Answer 1:
This is a gaps-and-islands problem, where each island ends with an "in" record, and you want to sum the values in each island.
Here is one approach that uses the count of following "in"s to define the group, and then a window sum over each group.
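select timestamp,
    -- 'out' rows keep their own value; 'in' rows take the sum of their group
    case when typerow = 'out'
        then val
        else sum(val) over(partition by grp order by timestamp)
    end as val,
    typerow
from (
    select t.*,
        -- grp counts the 'in' rows at or after this row, so each group ends with its 'in' row
        sum(case when typerow = 'in' then 1 else 0 end) over(order by timestamp desc) grp
    from @table t
) t
order by timestamp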
Demo on DB Fiddle:
timestamp               | val | typerow
:---------------------- | --: | :------
2018-06-03 13:30:00.000 |   6 | out
2018-06-03 14:10:00.000 |   8 | out
2018-06-03 14:30:00.000 |  17 | in
2018-06-03 15:00:00.000 |   9 | out
2018-06-03 15:30:00.000 |   4 | out
2018-06-03 16:00:00.000 |   2 | out
2018-06-03 17:05:00.000 |  23 | in
2018-06-03 17:30:00.000 |   0 | out
2018-06-03 18:15:00.000 |   7 | out
2018-06-03 18:30:00.000 |   8 | in
2018-06-03 19:00:00.000 |   5 | out
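One caveat for the version named in the question: an ORDER BY inside the OVER clause of an aggregate such as sum() is supported only from SQL Server 2012 onward, so the query above would not run on SQL Server 2008. Below is a minimal sketch of a 2008-compatible variant, assuming the TimeStamp values are distinct as the question states. The group number is computed with a correlated subquery (the number of earlier 'in' rows plus one, which matches the id column proposed in the question), and the group total uses SUM(...) OVER (PARTITION BY ...) without ORDER BY. The aliases g and p and the column name grp are illustrative choices, not part of the original answer.

```sql
-- Sample data from the question (table value constructors work on SQL Server 2008)
DECLARE @table TABLE (
    TimeStamp DATETIME,
    val INT,
    typerow VARCHAR(3)
);
INSERT INTO @table (TimeStamp, val, typerow)
VALUES
    ('2018-06-03 13:30:00.000', 6, 'out'),
    ('2018-06-03 14:10:00.000', 8, 'out'),
    ('2018-06-03 14:30:00.000', 3, 'in'),
    ('2018-06-03 15:00:00.000', 9, 'out'),
    ('2018-06-03 15:30:00.000', 4, 'out'),
    ('2018-06-03 16:00:00.000', 2, 'out'),
    ('2018-06-03 17:05:00.000', 8, 'in'),
    ('2018-06-03 17:30:00.000', 0, 'out'),
    ('2018-06-03 18:15:00.000', 7, 'out'),
    ('2018-06-03 18:30:00.000', 1, 'in'),
    ('2018-06-03 19:00:00.000', 5, 'out');

SELECT g.TimeStamp,
       CASE WHEN g.typerow = 'in'
            -- total of the whole group; PARTITION BY alone is enough because
            -- the 'in' row is the last row of its group
            THEN SUM(g.val) OVER (PARTITION BY g.grp)
            ELSE g.val
       END AS val,
       g.typerow
FROM (
    SELECT t.TimeStamp, t.val, t.typerow,
           -- group number: 1 + count of 'in' rows strictly before this row
           (SELECT COUNT(*)
            FROM @table AS p
            WHERE p.typerow = 'in'
              AND p.TimeStamp < t.TimeStamp) + 1 AS grp
    FROM @table AS t
) AS g
ORDER BY g.TimeStamp;
```

On the sample data this should give 17, 23 and 8 for the three 'in' rows and leave the 'out' rows unchanged, including the trailing 19:00 row, which ends up alone in a fourth group. The correlated subquery rescans the table for every row, which is probably acceptable for a few hundred records but would scale poorly on large tables.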
Source: https://stackoverflow.com/questions/64491255/partial-sum-between-different-records-using-sql-2008