Combine consecutive date ranges

青春壹個敷衍的年華 提交于 2019-12-17 16:37:13

问题


Using SQL Server 2008 R2,

I'm trying to combine date ranges into the maximum date range given that one end date is next to the following start date.

The data is about different employments. Some employees may have ended their employment and have rejoined at a later time. Those should count as two different employments (example ID 5). Some people have different types of employment, running after each other (enddate and startdate neck-to-neck), in this case it should be considered as one employment in total (example ID 30).

An employment period that has not ended has an enddate that is null.

Some examples is probably enlightening:

declare @t as table  (employmentid int, startdate datetime, enddate datetime)

insert into @t values
(5, '2007-12-03', '2011-08-26'),
(5, '2013-05-02', null),
(30, '2006-10-02', '2011-01-16'),
(30, '2011-01-17', '2012-08-12'),
(30, '2012-08-13', null),
(66, '2007-09-24', null)

-- expected outcome
EmploymentId StartDate   EndDate
5            2007-12-03  2011-08-26
5            2013-05-02  NULL
30           2006-10-02  NULL
66           2007-09-24  NULL

I've been trying different "islands-and-gaps" techniques but haven't been able to crack this one.


回答1:


The strange bit you see with my use of the date '31211231' is just a very large date to handle your "no-end-date" scenario. I have assumed you won't really have many date ranges per employee, so I've used a simple Recursive Common Table Expression to combine the ranges.

To make it run faster, the starting anchor query keeps only those dates that will not link up to a prior range (per employee). The rest is just tree-walking the date ranges and growing the range. The final GROUP BY keeps only the largest date range built up per starting ANCHOR (employmentid, startdate) combination.


SQL Fiddle

MS SQL Server 2008 Schema Setup:

create table Tbl (
  employmentid int,
  startdate datetime,
  enddate datetime);

insert Tbl values
(5, '2007-12-03', '2011-08-26'),
(5, '2013-05-02', null),
(30, '2006-10-02', '2011-01-16'),
(30, '2011-01-17', '2012-08-12'),
(30, '2012-08-13', null),
(66, '2007-09-24', null);

/*
-- expected outcome
EmploymentId StartDate   EndDate
5            2007-12-03  2011-08-26
5            2013-05-02  NULL
30           2006-10-02  NULL
66           2007-09-24  NULL
*/

Query 1:

;with cte as (
   select a.employmentid, a.startdate, a.enddate
     from Tbl a
left join Tbl b on a.employmentid=b.employmentid and a.startdate-1=b.enddate
    where b.employmentid is null
    union all
   select a.employmentid, a.startdate, b.enddate
     from cte a
     join Tbl b on a.employmentid=b.employmentid and b.startdate-1=a.enddate
)
   select employmentid,
          startdate,
          nullif(max(isnull(enddate,'32121231')),'32121231') enddate
     from cte
 group by employmentid, startdate
 order by employmentid

Results:

| EMPLOYMENTID |                        STARTDATE |                       ENDDATE |
-----------------------------------------------------------------------------------
|            5 |  December, 03 2007 00:00:00+0000 | August, 26 2011 00:00:00+0000 |
|            5 |       May, 02 2013 00:00:00+0000 |                        (null) |
|           30 |   October, 02 2006 00:00:00+0000 |                        (null) |
|           66 | September, 24 2007 00:00:00+0000 |                        (null) |



回答2:


SET NOCOUNT ON

DECLARE @T TABLE(ID INT,FromDate DATETIME, ToDate DATETIME)

INSERT INTO @T(ID,FromDate,ToDate)
SELECT 1,'20090801','20090803' UNION ALL
SELECT 2,'20090802','20090809' UNION ALL
SELECT 3,'20090805','20090806' UNION ALL
SELECT 4,'20090812','20090813' UNION ALL
SELECT 5,'20090811','20090812' UNION ALL
SELECT 6,'20090802','20090802'


SELECT ROW_NUMBER() OVER(ORDER BY s1.FromDate) AS ID,
       s1.FromDate, 
       MIN(t1.ToDate) AS ToDate 
FROM @T s1 
INNER JOIN @T t1 ON s1.FromDate <= t1.ToDate 
  AND NOT EXISTS(SELECT * FROM @T t2 
                 WHERE t1.ToDate >= t2.FromDate
                   AND t1.ToDate < t2.ToDate) 
WHERE NOT EXISTS(SELECT * FROM @T s2 
                 WHERE s1.FromDate > s2.FromDate
                   AND s1.FromDate <= s2.ToDate) 
GROUP BY s1.FromDate 
ORDER BY s1.FromDate



回答3:


A modified script for combining all overlapping periods.
For example
01.01.2001-01.01.2010
05.05.2005-05.05.2015

will give one period:
01.01.2001-05.05.2015

tbl.enddate must be completed

;WITH cte
  AS(
SELECT
  a.employmentid
  ,a.startdate
  ,a.enddate
from tbl a
left join tbl c on a.employmentid=c.employmentid
    and a.startdate > c.startdate
    and a.startdate <= dateadd(day, 1, c.enddate)
WHERE c.employmentid IS NULL

UNION all

SELECT
  a.employmentid
  ,a.startdate
  ,a.enddate
from cte a
inner join tbl c on a.startdate=c.startdate
    and (c.startdate = dateadd(day, 1, a.enddate) or (c.enddate > a.enddate and c.startdate <= a.enddate))
)
select distinct employmentid,
          startdate,
          nullif(max(enddate),'31.12.2099') enddate
from cte
group by employmentid, startdate


来源:https://stackoverflow.com/questions/15783315/combine-consecutive-date-ranges

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!