Applying LAG() to multiple rows with a null value

Submitted by 穿精又带淫゛_ on 2019-12-25 00:29:42

Question


Given:

with    
m as (
    select  1 ID, cast('03/01/2015' as datetime) PERIOD_START, cast('3/31/2015' as datetime) PERIOD_END
    union all
    select  1 ID, '04/01/2015', '4/28/2015'
    union all
    select  1 ID, '05/01/2015', '5/31/2015'
    union all
    select  1 ID, '06/01/2015', '06/30/2015'
    union all
    select  1 ID, '07/01/2015', '07/31/2015'
)

,
a as (

    SELECT  1 ID, cast('2015-03-13 14:17:00.000' as datetime) AUDIT_TIME, 'READ [2]' STATUS
    UNION ALL
    SELECT  1 ID, '2015-04-27 15:51:00.000' AUDIT_TIME, 'HELD [2]' STATUS
    UNION ALL
    SELECT  1 ID, '2015-07-08 17:54:00.000' AUDIT_TIME, 'COMPLETED [5]' STATUS
)

This query:

select  m.ID,PERIOD_START,PERIOD_END
        ,a.AUDIT_TIME,STATUS
from    m
LEFT OUTER JOIN a on m.id=a.id 
    and a.audit_time between m.period_start and m.period_end

generates this record set:

ID  PERIOD_START             PERIOD_END               AUDIT_TIME               STATUS
1   2015-03-01 00:00:00.000  2015-03-31 00:00:00.000  2015-03-13 14:17:00.000  READ [2]
1   2015-04-01 00:00:00.000  2015-04-28 00:00:00.000  2015-04-27 15:51:00.000  HELD [2]
1   2015-05-01 00:00:00.000  2015-05-31 00:00:00.000  NULL                     NULL
1   2015-06-01 00:00:00.000  2015-06-30 00:00:00.000  NULL                     NULL
1   2015-07-01 00:00:00.000  2015-07-31 00:00:00.000  2015-07-08 17:54:00.000  COMPLETED [5]

I need the 4/27/15 entry repeated for May and June:

ID  PERIOD_START             PERIOD_END               AUDIT_TIME               STATUS
1   2015-03-01 00:00:00.000  2015-03-31 00:00:00.000  2015-03-13 14:17:00.000  READ [2]
1   2015-04-01 00:00:00.000  2015-04-28 00:00:00.000  2015-04-27 15:51:00.000  HELD [2]
1   2015-05-01 00:00:00.000  2015-05-31 00:00:00.000  2015-04-27 15:51:00.000  HELD [2]
1   2015-06-01 00:00:00.000  2015-06-30 00:00:00.000  2015-04-27 15:51:00.000  HELD [2]
1   2015-07-01 00:00:00.000  2015-07-31 00:00:00.000  2015-07-08 17:54:00.000  COMPLETED [5]

Using the LAG() function:

select  m.ID,PERIOD_START,PERIOD_END
        ,a.AUDIT_TIME
        ,LAG(audit_time) OVER (partition by m.ID order by period_start) PRIOR_AUDIT_TIME
        ,STATUS
        ,LAG(STATUS) OVER (partition by m.ID order by period_start) PRIOR_STATUS
from    m
LEFT OUTER JOIN a on m.id=a.id 
    and a.audit_time between m.period_start and m.period_end

only works for a single row:

ID  PERIOD_START             PERIOD_END               AUDIT_TIME               PRIOR_AUDIT_TIME         STATUS         PRIOR_STATUS
1   2015-03-01 00:00:00.000  2015-03-31 00:00:00.000  2015-03-13 14:17:00.000  NULL                     READ [2]       NULL
1   2015-04-01 00:00:00.000  2015-04-28 00:00:00.000  2015-04-27 15:51:00.000  2015-03-13 14:17:00.000  HELD [2]       READ [2]
1   2015-05-01 00:00:00.000  2015-05-31 00:00:00.000  NULL                     2015-04-27 15:51:00.000  NULL           HELD [2]
1   2015-06-01 00:00:00.000  2015-06-30 00:00:00.000  NULL                     NULL                     NULL           NULL
1   2015-07-01 00:00:00.000  2015-07-31 00:00:00.000  2015-07-08 17:54:00.000  NULL                     COMPLETED [5]  NULL

Is there a way to do this without having to resort to a cursor?


Answer 1:


You can do this with window functions:

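-- Builds on the m and a CTEs from the question; to chain it onto them,
-- replace the leading "with" by a comma.
with q as (
      select m.ID, PERIOD_START, PERIOD_END, a.AUDIT_TIME, STATUS
      from m LEFT OUTER JOIN
           a
           on m.id = a.id and
              a.audit_time between m.period_start and m.period_end
     )
select g.*,
       -- each audit_grp group holds one non-null status, so max() returns it
       max(status) over (partition by id, audit_grp) as imputed_status
from (select q.*,
             -- running max carries the latest non-null audit_time forward
             max(audit_time) over (partition by id order by period_start) as audit_grp
      from q
     ) g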

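The idea is to carry the audit_time value forward, using max() as a cumulative window function: because audit times increase along with period_start, the running maximum is always the most recent non-null audit_time. That carried-forward value (audit_grp) then defines groups of rows, and each group contains exactly one non-null status, so max(status) within the group fills in the status as well.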
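
To try this end to end, here is a self-contained sketch that combines the sample data from the question with the query above. A few things are my own choices rather than part of the original: the date literals use the YYYYMMDD and ISO 8601 forms so they parse the same under any DATEFORMAT setting, the inner derived table becomes a CTE named g, the output column imputed_audit_time is a hypothetical name for audit_grp, and an ORDER BY is added for stable output.

```sql
with m as (
    -- periods from the question
    select ID,
           cast(PERIOD_START as datetime) PERIOD_START,
           cast(PERIOD_END as datetime) PERIOD_END
    from (values (1, '20150301', '20150331'),
                 (1, '20150401', '20150428'),
                 (1, '20150501', '20150531'),
                 (1, '20150601', '20150630'),
                 (1, '20150701', '20150731')) v (ID, PERIOD_START, PERIOD_END)
),
a as (
    -- audit entries from the question
    select ID, cast(AUDIT_TIME as datetime) AUDIT_TIME, STATUS
    from (values (1, '2015-03-13T14:17:00', 'READ [2]'),
                 (1, '2015-04-27T15:51:00', 'HELD [2]'),
                 (1, '2015-07-08T17:54:00', 'COMPLETED [5]')) v (ID, AUDIT_TIME, STATUS)
),
q as (
    -- one row per period, with NULLs where no audit entry falls inside it
    select m.ID, m.PERIOD_START, m.PERIOD_END, a.AUDIT_TIME, a.STATUS
    from m
    left outer join a
        on m.ID = a.ID
       and a.AUDIT_TIME between m.PERIOD_START and m.PERIOD_END
),
g as (
    -- running max: the latest non-null AUDIT_TIME seen so far
    select q.*,
           max(AUDIT_TIME) over (partition by ID order by PERIOD_START) as audit_grp
    from q
)
select ID, PERIOD_START, PERIOD_END,
       audit_grp as imputed_audit_time,
       -- the only non-null STATUS within each audit_grp group
       max(STATUS) over (partition by ID, audit_grp) as imputed_status
from g
order by ID, PERIOD_START;
```

With the sample data, the May and June rows come back with 2015-04-27 15:51 and HELD [2]. Note that the approach relies on audit times never decreasing as period_start increases; if that did not hold, the running maximum would no longer be the most recent entry.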

ANSI SQL defines an IGNORE NULLS option for LAG(), but SQL Server did not support it when this answer was written; it was added in SQL Server 2022.
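
For comparison, here is a rough sketch of what the IGNORE NULLS form could look like on an engine that accepts it (to my knowledge, SQL Server 2022 or later). It is not something the original answer could run, and the imputed_* column names are hypothetical. COALESCE keeps the current row's own value when there is one and otherwise falls back to the nearest earlier non-null value.

```sql
-- Assumes an engine where LAG() accepts IGNORE NULLS (e.g. SQL Server 2022+).
with m as (
    select ID,
           cast(PERIOD_START as datetime) PERIOD_START,
           cast(PERIOD_END as datetime) PERIOD_END
    from (values (1, '20150301', '20150331'),
                 (1, '20150401', '20150428'),
                 (1, '20150501', '20150531'),
                 (1, '20150601', '20150630'),
                 (1, '20150701', '20150731')) v (ID, PERIOD_START, PERIOD_END)
),
a as (
    select ID, cast(AUDIT_TIME as datetime) AUDIT_TIME, STATUS
    from (values (1, '2015-03-13T14:17:00', 'READ [2]'),
                 (1, '2015-04-27T15:51:00', 'HELD [2]'),
                 (1, '2015-07-08T17:54:00', 'COMPLETED [5]')) v (ID, AUDIT_TIME, STATUS)
)
select m.ID, m.PERIOD_START, m.PERIOD_END,
       -- own value if present, else the last non-null value from earlier periods
       coalesce(a.AUDIT_TIME,
                lag(a.AUDIT_TIME) ignore nulls
                    over (partition by m.ID order by m.PERIOD_START)) as imputed_audit_time,
       coalesce(a.STATUS,
                lag(a.STATUS) ignore nulls
                    over (partition by m.ID order by m.PERIOD_START)) as imputed_status
from m
left outer join a
    on m.ID = a.ID
   and a.AUDIT_TIME between m.PERIOD_START and m.PERIOD_END
order by m.ID, m.PERIOD_START;
```

Unlike the max() approach, this version does not depend on the audit times being in increasing order, since it looks back by row position rather than by value.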



Source: https://stackoverflow.com/questions/31440623/applying-lag-to-multiple-rows-with-a-null-value
