问题
I have a daily incoming data with following format:
(id, dat, state, val)
where id
is unique entity, dat
is date (i.e. ISO 'yyyy-MM-dd'), state
is a state of an entity, val
is a value of an entity at this state.
I need to transform it into data like that format: (id, state, val, dat1, dat2, month)
where dat1
and dat2
are datestamps between which the entity id
has state state
with val
value within month.
I'm struggling to find out a correct way to collect and persist data like this format. I understand that solution must meet the following requirements:
1) If id
has changed his state
from st1
to st2
in new daily data(i.e. on 2017-01-27), there should be a row with closed period like that:
(id=1, state=st1, val=2.5, dat1='2017-01-15', dat2='2017-01-26', month='2017-01-01')
and a row for state st2
like that:
(id=1, state=st2, val=5.5, dat1='2017-01-27', dat2='2017-01-31', month='2017-01-01')
2) If on 2017-02-14 there is a new id
for a table, there should be an inserted row like that:
(id=2, state=st3, val=1.5, dat1='2017-02-01', dat2='2017-02-28', month='2017-02-01')
,
so if there is a new id in daily data and this id doesn't exist in previous periods we put his birth date as beginning of the period and hold the closing date as end of month until there is a change of state of this id within February month.
3) A table should be designed in the way allowing having history of table versions before insert\update a new daily piece of data (bi-temporal table?) in order to have an opportunity to discard wrong data and rollback.
4) There should be a query for table which is able to detect overlapping date intervals of the same ID (force ID integrity over time).
5) There should be some necessary indexes for this kind of data. Id should be a foreign key.
6) Should there be a trigger which works when a new piece of data has arrived into load-table and makes operations described in 1),2),3) ?
Could anyone tell the literature or known approaches for this kind of tasks? I believe it is solved task but I couldn't determine the proper name of data model and find any clear examples for that kind of data.
回答1:
Below code updates Entity state table based on "Daily Data". I have created my own sample data, that to the best of my understanding fits your requirements.
CREATE TABLE #DailyData( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME )
CREATE TABLE #EntityState( id INT, [state] VARCHAR( 10 ), val FLOAT, dat1 DATETIME, dat2 DATETIME, [month] DATETIME )
INSERT INTO #EntityState
SELECT 1, 'st2', 4.6, '01-Jul-2017', '08-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 1, 'st1', 1.2, '09-Jul-2017', '31-Jul-2017', '01-Jul-2017' UNION ALL
SELECT 2, 'st1', 2.2, '08-Jul-2017', '31-Jul-2017', '01-Jul-2017'
INSERT INTO #DailyData
SELECT 1, 'st3', 5.6, '15-Jul-2017' UNION ALL
SELECT 1, 'st2', 4.6, '01-Aug-2017' UNION ALL
SELECT 2, 'st2', 2.5, '10-Jul-2017'
SELECT 'Before', * FROM #EntityState
;WITH Q0 AS(
-- Insert current state
SELECT id, [state], val, dat1
FROM
( SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 DESC ) AS LatestDate
FROM #EntityState ) AS CurrentState
WHERE LatestDate = 1
UNION ALL
-- Combine with Daily Import
SELECT id, [state], val, dat1
FROM #DailyData
)
,
-- Add Date order column
Q1 AS (
SELECT id, [state], val, dat1, ROW_NUMBER() OVER( PARTITION BY id ORDER BY dat1 ASC ) AS DateOrder
FROM Q0
)
-- Merge new Entity states with Existing
MERGE INTO #EntityState AS T
USING
-- New Entity states
( SELECT Prev.id, Prev.[state], Prev.val, Prev.dat1, Next.dat1 AS dat2
FROM Q1 AS Prev
LEFT JOIN Q1 AS Next ON Prev.id = Next.id AND Prev.DateOrder + 1 = Next.DateOrder ) AS S
ON T.id = S.id AND T.dat1 = S.dat1 AND T.[state] = S.[state]
-- Update dat2 on last day's state
WHEN MATCHED AND T.dat2 = EOMONTH( T.dat1 )
THEN UPDATE
SET T.dat2 = S.dat2
-- Add new Entity State
WHEN NOT MATCHED BY TARGET THEN
INSERT( id, state, val, dat1, dat2, [month] )
VALUES( S.id, S.state, S.val, S.dat1, ISNULL( S.dat2, EOMONTH( S.dat1 )),
DATEADD( month, DATEDIFF( month, 0, S.dat1 ), 0 ));
SELECT 'After', * FROM #EntityState
This code was adapted from: Count seconds on switch interval SQL Server
来源:https://stackoverflow.com/questions/44979192/sql-construct-and-renew-date-intervals-for-entities-based-on-daily-incoming-data