SQL Server : Consecutive Rows Issue

问题

I'm having difficulty with a consecutive rows query, and need some assistance.

I have a table that is part of a telephony switch. It tracks every employee who logs into a phone (and the phones extension). I have another table that tells me where that phone is (across multiple sites).

The logic we're going for is if someone logs into a particular phone/site for 2 consecutive days, we'll trigger an event to update their employee record to make this site their primary location.

The issue I'm having with my partitioning is that even if the data isn't truly consecutive, my count is increasing.

If you consider the following table:

row_date    logid   extn    duration    sitecode    daysconsecutive
----------------------------------------------------------------------------    
2014-05-22  500001  414128  9.82    W   1
2014-05-27  500001  414120  10.74   W   2
2014-05-28  500001  414149  5.47    W   3
2014-05-28  500001  414126  6.18    W   4
2014-05-29  500001  414128  11.80   W   5
2014-05-30  500001  414128  2.21    W   6
2014-05-30  500001  414150  5.47    N   1
2014-05-31  500001  414128  4.57    W   7
2014-06-02  500001  414150  4.94    N   2

As you can see on 5/30 this person moved from site W to N. Then on 5/31 they went back to site W -- but this row should be a 1, not a 7, as it's not contiguous with the 6 consecutive days he was at site W from 5/22 - 5/30.

Once I have this ironed out, I'll select the max entry where daysconsecutive>=2 and use that to fire the update.

Any help you can offer would be appreciated.

This is for SQL Server 2008 R2.

Cheers.

Fiddle here.

create table haglog (row_date datetime, logid int, extn int, duration decimal(10,2), sitecode varchar(10))

insert into haglog 
select '2014-05-22',500001,414128,9.82,'W' union all
select '2014-05-27',500001,414120,10.74,'W' union all
select '2014-05-28',500001,414149,5.47,'W' union all
select '2014-05-28',500001,414126,6.18,'W' union all
select '2014-05-29',500001,414128,11.80,'W' union all
select '2014-05-30',500001,414128,2.21,'W' union all
select '2014-05-30',500001,414150,5.47,'N' union all
select '2014-05-31',500001,414128,4.57,'W' union all
select '2014-06-02',500001,414150,4.94,'N'

;with consecutivecte
AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY logid, sitecode ORDER BY row_date) AS daysconsecutive
FROM haglog
)

select * 
from consecutivecte
where logid=500001
order by row_date

回答1:

You can use a gaps-and-islands technique to isolate runs of consecutive days. If you have two rowcounts, one partitioned by sitecode and one unpartitioned, then the difference between the two is invariant within a consecutive run of the same sitecode. This can be used to distinguish between different runs and generate a running count for each.

WITH
  t1 AS (
    SELECT *,
      ROW_NUMBER() OVER(PARTITION BY logid, sitecode ORDER BY row_date)
      - ROW_NUMBER() OVER(PARTITION BY logid ORDER BY row_date) consecutive_group_id
    FROM @haglog
  ),
  t2 AS (
    SELECT *,
      ROW_NUMBER() OVER(PARTITION BY logid, sitecode, consecutive_group_id ORDER BY row_date) daysconsecutive
    FROM t1
  )
SELECT row_date, logid, extn, duration, sitecode, daysconsecutive
FROM t2
ORDER BY row_date

回答2:

I would use cross apply (select top(1) ...) instead of CTE to query previous record for each record in dataset. And limit it by one last day.

So you will get say yesterday records and for each of them previous records by the same user.

select h.* , t.* 
from haglog h
    cross apply ( select top(1) * from haglog t where ...) as t
where logid=500001 and date = @yesterday

来源：https://stackoverflow.com/questions/24018840/sql-server-consecutive-rows-issue

标签

sql-server

sql-server-2008-r2

gaps-and-islands