How can I write a query to extract individual changes from snapshots of data?

天命终不由人 2020-12-20 02:39

I need to create a process that will extract the changes from a table where each row is a snapshot of a row in another table. The real-world problem involves many tables wit

2 Answers
  • 2020-12-20 02:57

    Here's a working sample that uses UNPIVOT. It's based on my answer to my own question, "Better way to Partially UNPIVOT in Pairs in SQL".

    This has some nice features.

    1. Adding fields is easy: just add them to the SELECT list and to the UNPIVOT clause. You don't have to add more UNION clauses.

    2. The WHERE clause, WHERE curr.value <> prev.value, never changes regardless of how many fields are added.

    3. The performance is surprisingly fast.

    4. It's portable to current versions of Oracle, if you need that.


    SQL

    DECLARE @Snapshots AS TABLE (
        Sequence  int,
        DateTaken datetime,
        [id]      int,
        field1    varchar(20),
        field2    int)

    INSERT INTO @Snapshots VALUES
          (1,    '2011-01-01',      1,     'Red',          2),
          (2,    '2011-01-01',      2,     'Blue',        10),
          (3,    '2011-02-01',      1,     'Green',        2),
          (4,    '2011-03-01',      1,     'Green',        3),
          (5,    '2011-03-01',      2,     'Purple',       2),
          (6,    '2011-04-01',      1,     'Yellow',       2)

    -- Number the rows so that consecutive snapshots of the same ID are adjacent.
    ;WITH Snapshots (Sequence, DateTaken, ID, Field1, Field2, _Index)
    AS
    (
        SELECT Sequence, DateTaken, ID, Field1, Field2,
               ROW_NUMBER() OVER (ORDER BY ID, Sequence) _Index
        FROM @Snapshots
    )
    -- Pair each snapshot (c) with the one before it (p).
    -- UNPIVOT needs every listed column to have the same type, hence the casts.
    , data AS
    (
        SELECT
              c._Index
            , c.DateTaken
            , c.ID
            , CAST(c.Field1 AS varchar(max)) Field1
            , CAST(p.Field1 AS varchar(max)) Field1_Previous
            , CAST(c.Field2 AS varchar(max)) Field2
            , CAST(p.Field2 AS varchar(max)) Field2_Previous
        FROM Snapshots c
        JOIN Snapshots p ON p.ID = c.ID AND (p._Index + 1) = c._Index
    )
    -- Turn the current/previous columns into one row per field.
    , fieldsToRows AS
    (
        SELECT DateTaken,
               id,
               _Index,
               value,
               field
        FROM   data p UNPIVOT (value FOR field IN (field1, field1_previous,
                                                   field2, field2_previous) )
               AS unpvt
    )
    -- Match each field row with its "_Previous" row and keep only real changes.
    SELECT
        curr.DateTaken,
        curr.ID,
        curr.field,
        prev.value previous,
        curr.value 'current'
    FROM
        fieldsToRows curr
        INNER JOIN fieldsToRows prev
            ON curr.ID = prev.id
                AND curr._Index = prev._Index
                AND curr.field + '_Previous' = prev.field
    WHERE
        curr.value <> prev.value

    Output

    DateTaken               ID          field     previous current
    ----------------------- ----------- --------- -------- -------
    2011-02-01 00:00:00.000 1           Field1    Red      Green
    2011-03-01 00:00:00.000 1           Field2    2        3
    2011-04-01 00:00:00.000 1           Field1    Green    Yellow
    2011-04-01 00:00:00.000 1           Field2    3        2
    2011-03-01 00:00:00.000 2           Field1    Blue     Purple
    2011-03-01 00:00:00.000 2           Field2    10       2
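
As a cross-check of the output above, here is a minimal Python sketch of the same pairing logic: sort the snapshots by ID and Sequence, pair each row with the previous row for the same ID, and emit one record per field whose value differs. It is not the query itself, only an illustration; the names `extract_changes` and `FIELDS` are made up for this example, and the rows are copied from the sample data.

```python
from itertools import groupby
from operator import itemgetter

# Same sample rows as the @Snapshots table:
# (Sequence, DateTaken, id, field1, field2)
snapshots = [
    (1, "2011-01-01", 1, "Red", 2),
    (2, "2011-01-01", 2, "Blue", 10),
    (3, "2011-02-01", 1, "Green", 2),
    (4, "2011-03-01", 1, "Green", 3),
    (5, "2011-03-01", 2, "Purple", 2),
    (6, "2011-04-01", 1, "Yellow", 2),
]

# Hypothetical mapping: field name -> position in the tuple.
FIELDS = {"Field1": 3, "Field2": 4}


def extract_changes(rows):
    """Return (DateTaken, id, field, previous, current) for every changed field."""
    changes = []
    # Equivalent of ROW_NUMBER() OVER (ORDER BY ID, Sequence).
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))
    for _, group in groupby(ordered, key=itemgetter(2)):
        group = list(group)
        # Pair each snapshot with its predecessor for the same id.
        for prev, curr in zip(group, group[1:]):
            for name, pos in FIELDS.items():
                if curr[pos] != prev[pos]:
                    changes.append(
                        (curr[1], curr[2], name, str(prev[pos]), str(curr[pos]))
                    )
    return changes


changes = extract_changes(snapshots)
for change in changes:
    print(change)
```

On the sample data this yields the same six changes as the table above. One difference worth keeping in mind: Python's `!=` treats `None` as different from any value, whereas the SQL `<>` comparison does not match when either side is NULL.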
    
  • 2020-12-20 03:03
    WITH Snapshots (Sequence, DateTaken, ID, Field, FieldValue, _Index) AS
    (
        SELECT
            Sequence,
            DateTaken,
            ID,
            'Field1' AS Field,
            CAST(Field1 AS VARCHAR(100)) AS FieldValue,  -- Find an appropriate length
            ROW_NUMBER() OVER (ORDER BY ID, Sequence)
        FROM
            #Snapshots
        UNION ALL
        SELECT
            Sequence,
            DateTaken,
            ID,
            'Field2' AS Field,
            CAST(Field2 AS VARCHAR(100)) AS FieldValue,  -- Find an appropriate length
            ROW_NUMBER() OVER (ORDER BY ID, Sequence)
        FROM
            #Snapshots
    )
    SELECT
        S1.DateTaken,
        S1.ID,
        S1.Field,
        S1.FieldValue AS Previous,
        S2.FieldValue As New   -- Not necessarily "Current"
    FROM
        Snapshots S1
    INNER JOIN Snapshots S2 ON
        S2.ID = S1.ID AND
        S2.Field = S1.Field AND
        S2._Index = S1._Index + 1 AND
        S2.FieldValue <> S1.FieldValue    -- Might need to handle NULL values
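
The comment on the last join condition matters because `<>` evaluates to unknown when either side is NULL, so a change to or from NULL is silently dropped. The sketch below illustrates this with Python's built-in `sqlite3` module rather than SQL Server, purely so that it can be run anywhere. The `pairs` table and its rows are invented for the example; the expanded predicate uses only standard comparisons and `IS NULL`, so it should carry over to the query above with `S2.FieldValue` and `S1.FieldValue` in place of `new` and `previous`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of already-paired values (previous snapshot vs. new one).
conn.execute("CREATE TABLE pairs (id INTEGER, previous TEXT, new TEXT)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [
        (1, "Red", "Green"),    # ordinary change
        (2, "Blue", None),      # value changed to NULL
        (3, None, None),        # NULL on both sides: no change
        (4, "Green", "Green"),  # no change
    ],
)

# Plain inequality: the NULL transition in row 2 is lost.
plain = conn.execute(
    "SELECT id FROM pairs WHERE new <> previous ORDER BY id"
).fetchall()

# NULL-aware inequality: also treat "exactly one side is NULL" as a change.
null_safe = conn.execute(
    """
    SELECT id FROM pairs
    WHERE new <> previous
       OR (new IS NULL AND previous IS NOT NULL)
       OR (new IS NOT NULL AND previous IS NULL)
    ORDER BY id
    """
).fetchall()

print(plain)      # [(1,)]
print(null_safe)  # [(1,), (2,)]
```

Whether a transition to or from NULL should count as a change depends on what NULL means in the snapshot tables, so treat the second predicate as one possible choice rather than the required one.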
    