Question
We are trying to include in our SQL only those rows where the next row is within 12 hours, based on a timestamp. Along with this, we also need to rank the rows in order to identify initial contact followed by the number of contacts within the time frame.
Unfortunately, we cannot just look for min() and max() within 12 hours, as the overall date range can be longer (months) while the time between contacts can be at most 12 hours. A person may have multiple contacts within the overall date range, and the initial contact has a few requirements specific to the business (see the InitialCalls CTE in the SQL below).
SQL used so far:
set nocount on;
set transaction isolation level read uncommitted;
set datefirst 1;
---------------------------------------------------------------------------------------
declare @FromDate as datetime = '2017-01-30T00:00:00';
declare @ToDate as datetime = '2017-02-05T23:59:59';
---------------------------------------------------------------------------------------
with [InitialCalls] as
(
select
d.PatientRef,
d.CaseRef,
d.PathwaysStartDate as [StartDate],
d.PathwaysFinishDate as [FinishDate]
from dbo.[111Data] as [d]
where d.PathwaysStartDate between @FromDate and @ToDate
and (d.MDSSpeaktoPrimaryCareService = 1 or d.MDSContactPrimaryCareService = 1)
and d.PathwaysDxCode in ('Dx05','Dx06','Dx07','Dx08','Dx11','Dx110','Dx1111','Dx116','Dx117','Dx12','Dx13','Dx14','Dx15','Dx17','Dx18','Dx19','Dx20','Dx21','Dx61','Dx80','Dx85','Dx86','Dx87','Dx93','Dx93')
and d.PathwaysFinalTriage = 1
and d.PathwaysAbandonedTriage = 0
and d.ReferralCategory not in ('All Services Rejected','Unsuccessful Lookup','No DoS Selected')
),
[AllCalls] as
(
select distinct
count(d.CaseRef) over (partition by d.PatientRef) as [CaseVol],
d.PatientRef,
d.CaseRef,
d.PathwaysStartDate as [StartDate],
d.PathwaysFinishDate as [FinishDate]
from dbo.[111Data] as [d]
inner join [InitialCalls] as [ic] on ic.PatientRef = d.PatientRef
where d.PathwaysStartDate between ic.StartDate and dateadd(hour,12, ic.StartDate)
and d.PathwaysFinalTriage = 1
and d.PathwaysAbandonedTriage = 0
and d.PatientRef = 'A3E14866-4DD5-4001-AF63-21819F49B401'
)
select
rank() over (partition by ac.PatientRef order by ac.StartDate) as [Rank],
ac.PatientRef,
ac.CaseRef,
ac.StartDate,
ac.FinishDate,
lag(ac.FinishDate) over (partition by ac.PatientRef order by ac.FinishDate asc) as [PreviousRowFinishDate],
datediff(hour, lag(ac.FinishDate) over (partition by ac.PatientRef order by ac.FinishDate asc), ac.StartDate) as [HoursDifference]
from [AllCalls] as [ac]
where ac.CaseVol > 1
Current output: (Link to current output)
Expected output: (Link to expected output)
In this instance, we would like to exclude the very first row (as it does not have a follow-on contact within 12 hours), then rank each instance of repeat contact. This is so we can track how many people called with a specific issue and then called again to chase it up.
EDIT - Table Creation and altered SQL
declare @table as table
(
[CaseRef] uniqueidentifier,
[PatientRef] uniqueidentifier,
[StartDate] datetime,
[FinishDate] datetime
);
insert into @table
(
[CaseRef],
[PatientRef],
[StartDate],
[FinishDate]
)
values
('DB79C49E-938C-4C40-B48E-3389D9339759', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 00:22:41', '2017-01-30 00:28:06'),
('4BFA4E3B-D313-4777-A290-3C13601D5C95', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:00:46', '2017-01-30 22:10:24'),
('F910D4DE-3CEE-4429-8844-DDE860D08192', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:25:49', '2017-01-30 22:27:58'),
('DF28DC91-02E3-47F2-88E0-397C2CBCFE41', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:44:11', '2017-01-30 22:53:22'),
('D6964286-8AE7-46AB-8DA5-88A347015C4D', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:55:17', '2017-01-30 23:01:57'),
('660B2ED7-B715-4A6C-A92B-D80267C0E4F5', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:06:16', '2017-01-30 23:08:28'),
('903AC539-4BB1-44AB-AFDB-D86C13310011', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:15:21', '2017-01-30 23:16:02'),
('75B88E5F-4795-4A21-9EA6-3B41CE958250', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:28:31', '2017-01-30 23:29:53'),
('DD6A4BD5-EF75-44CE-9309-4C14B2A21FF4', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:45:42', '2017-01-30 23:46:13'),
('518319BA-0EDE-46D8-B0B7-E8CEB233DEDF', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:54:02', '2017-01-31 00:03:13'),
('FB5A5A54-E580-40F2-94FD-64E20EA5C4DD', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-31 16:13:01', '2017-01-31 16:21:02'),
('8A4FD0C3-59BF-43AB-A829-F2396D6FB26A', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-31 18:26:14', '2017-01-31 18:39:20'),
('8CB94AF1-9664-4081-A2E1-271ED16B147B', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-01 08:10:41', '2017-02-01 08:18:18'),
('0DC6B68B-0458-48DF-B286-C1A978653981', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-01 15:40:45', '2017-02-01 15:48:24'),
('DB239857-6870-4AD9-8149-69ED6151CCB2', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 16:54:40', '2017-02-02 17:10:27'),
('938CCFF4-66C9-48B1-BDB7-D9144D2BD522', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 19:29:18', '2017-02-02 19:30:14'),
('1EC730D0-AF85-45BF-BD06-12B23124151F', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 19:43:28', '2017-02-02 19:47:12');
set nocount on;
set transaction isolation level read uncommitted;
set datefirst 1;
with [InitialCalls] as
(
select
t.PatientRef,
t.CaseRef,
t.StartDate,
t.FinishDate
from @table as [t]
),
[AllCalls] as
(
select distinct
count(t.CaseRef) over (partition by t.PatientRef) as [CaseVol],
t.PatientRef,
t.CaseRef,
t.StartDate,
t.FinishDate
from @table as [t]
inner join [InitialCalls] as [ic] on ic.PatientRef = t.PatientRef
where t.StartDate between ic.StartDate and dateadd(hour,12, ic.StartDate)
)
select
rank() over (partition by ac.PatientRef order by ac.StartDate) as [Rank],
ac.PatientRef,
ac.CaseRef,
ac.StartDate,
ac.FinishDate,
lag(ac.FinishDate) over (partition by ac.PatientRef order by ac.FinishDate asc) as [PreviousRowFinishDate],
datediff(hour, lag(ac.FinishDate) over (partition by ac.PatientRef order by ac.FinishDate asc), ac.StartDate) as [HoursDifference]
from [AllCalls] as [ac]
where ac.CaseVol > 1;
Final Edit - Answer with lots of help from Vladimir
set nocount on;
set transaction isolation level read uncommitted;
set datefirst 1;
---------------------------------------------------------------------------------------
declare @FromDate as datetime = '2017-01-30T00:00:00';
declare @ToDate as datetime = '2017-02-05T23:59:59';
---------------------------------------------------------------------------------------
with [InitialCalls] as
(
select
d.PatientRef,
d.CaseRef,
d.PathwaysStartDate,
d.PathwaysFinishDate,
d.PathwaysDxCode
from dbo.[111Data] as [d]
where d.PathwaysStartDate between @FromDate and @ToDate
and (d.MDSSpeaktoPrimaryCareService = 1 or d.MDSContactPrimaryCareService = 1)
and d.PathwaysDxCode in ('Dx05','Dx06','Dx07','Dx08','Dx11','Dx110','Dx1111','Dx116','Dx117','Dx12','Dx13','Dx14','Dx15','Dx17','Dx18','Dx19','Dx20','Dx21','Dx61','Dx80','Dx85','Dx86','Dx87','Dx93','Dx93')
and d.PathwaysFinalTriage = 1
and d.PathwaysAbandonedTriage = 0
and d.ReferralCategory not in ('All Services Rejected','Unsuccessful Lookup','No DoS Selected')
),
[AllCalls] as
(
select
d.PatientRef,
d.CaseRef,
d.CaseNumber,
d.PathwaysStartDate,
d.PathwaysFinishDate,
isnull(lag(d.PathwaysStartDate) over (partition by d.PatientRef order by d.PathwaysStartDate), '1900-01-01') as [PreviousStartDate]
from dbo.[111Data] as [d]
inner join [InitialCalls] as [ic] on ic.PatientRef = d.PatientRef
where d.PathwaysStartDate between ic.PathwaysStartDate and dateadd(hour,12, ic.PathwaysStartDate)
and d.PathwaysFinalTriage = 1
and d.PathwaysAbandonedTriage = 0
),
[InitialCallsMarkers] as
(
select
ic.PatientRef,
ic.CaseRef,
ic.CaseNumber,
ic.PathwaysStartDate,
ic.PathwaysFinishDate,
iif(datediff(hour, ic.PreviousStartDate, ic.PathwaysStartDate) >= 12, 1, 0) as [Marker]
from [AllCalls] as [ic]
),
[InitialCallsSequences] as
(
select distinct
icm.PatientRef,
icm.CaseRef,
icm.CaseNumber,
icm.PathwaysStartDate,
icm.PathwaysFinishDate,
icm.Marker,
sum(icm.Marker) over (partition by icm.PatientRef order by icm.PathwaysStartDate rows between unbounded preceding and current row) as [SequenceNumber]
from [InitialCallsMarkers] as [icm]
),
[InitialCallsRanks] as
(
select
ics.PatientRef,
ics.CaseRef,
ics.CaseNumber,
ics.PathwaysStartDate,
ics.PathwaysFinishDate,
ics.SequenceNumber,
ics.Marker,
row_number() over (partition by ics.PatientRef, ics.SequenceNumber order by ics.PathwaysStartDate) as [Rank],
count(*) over (partition by ics.PatientRef, ics.SequenceNumber) as [SequenceLength]
from [InitialCallsSequences] as [ics]
)
select
icr.[Rank],
icr.PatientRef,
icr.CaseRef,
icr.CaseNumber,
icr.PathwaysStartDate,
icr.PathwaysFinishDate,
icr.Marker,
icr.SequenceNumber,
icr.SequenceLength
from [InitialCallsRanks] as [icr]
where icr.SequenceLength > 1
order by icr.PatientRef, icr.PathwaysStartDate;
Answer 1:
Sample data
declare @table as table
(
[CaseRef] uniqueidentifier,
[PatientRef] uniqueidentifier,
[StartDate] datetime,
[FinishDate] datetime
);
insert into @table
(
[CaseRef],
[PatientRef],
[StartDate],
[FinishDate]
)
values
('DB79C49E-938C-4C40-B48E-3389D9339759', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 00:22:41', '2017-01-30 00:28:06'),
('4BFA4E3B-D313-4777-A290-3C13601D5C95', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:00:46', '2017-01-30 22:10:24'),
('F910D4DE-3CEE-4429-8844-DDE860D08192', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:25:49', '2017-01-30 22:27:58'),
('DF28DC91-02E3-47F2-88E0-397C2CBCFE41', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:44:11', '2017-01-30 22:53:22'),
('D6964286-8AE7-46AB-8DA5-88A347015C4D', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 22:55:17', '2017-01-30 23:01:57'),
('660B2ED7-B715-4A6C-A92B-D80267C0E4F5', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:06:16', '2017-01-30 23:08:28'),
('903AC539-4BB1-44AB-AFDB-D86C13310011', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:15:21', '2017-01-30 23:16:02'),
('75B88E5F-4795-4A21-9EA6-3B41CE958250', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:28:31', '2017-01-30 23:29:53'),
('DD6A4BD5-EF75-44CE-9309-4C14B2A21FF4', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:45:42', '2017-01-30 23:46:13'),
('518319BA-0EDE-46D8-B0B7-E8CEB233DEDF', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-30 23:54:02', '2017-01-31 00:03:13'),
('FB5A5A54-E580-40F2-94FD-64E20EA5C4DD', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-31 16:13:01', '2017-01-31 16:21:02'),
('8A4FD0C3-59BF-43AB-A829-F2396D6FB26A', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-01-31 18:26:14', '2017-01-31 18:39:20'),
('8CB94AF1-9664-4081-A2E1-271ED16B147B', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-01 08:10:41', '2017-02-01 08:18:18'),
('0DC6B68B-0458-48DF-B286-C1A978653981', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-01 15:40:45', '2017-02-01 15:48:24'),
('DB239857-6870-4AD9-8149-69ED6151CCB2', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 16:54:40', '2017-02-02 17:10:27'),
('938CCFF4-66C9-48B1-BDB7-D9144D2BD522', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 19:29:18', '2017-02-02 19:30:14'),
('1EC730D0-AF85-45BF-BD06-12B23124151F', 'A3E14866-4DD5-4001-AF63-21819F49B401', '2017-02-02 19:43:28', '2017-02-02 19:47:12');
Query
WITH
CTE_Prev
AS
(
SELECT
CaseRef
,PatientRef
,StartDate
,FinishDate
,ISNULL(LAG(StartDate) OVER (PARTITION BY PatientRef ORDER BY StartDate),
'2000-01-01') AS PrevStart
FROM @Table AS T
)
,CTE_Markers
AS
(
SELECT
CaseRef
,PatientRef
,StartDate
,FinishDate
,PrevStart
,CASE WHEN (DATEDIFF(hour, PrevStart, StartDate) >= 12)
THEN 1 ELSE 0 END AS GapIsLargeMarker
FROM CTE_Prev
)
,CTE_Sequences
AS
(
SELECT
CaseRef
,PatientRef
,StartDate
,FinishDate
,PrevStart
,GapIsLargeMarker
,SUM(GapIsLargeMarker) OVER (PARTITION BY PatientRef ORDER BY StartDate
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SeqNumber
FROM CTE_Markers
)
,CTE_Ranks
AS
(
SELECT
CaseRef
,PatientRef
,StartDate
,FinishDate
,PrevStart
,GapIsLargeMarker
,SeqNumber
,ROW_NUMBER() OVER (PARTITION BY PatientRef,SeqNumber ORDER BY StartDate) AS rnk
,COUNT(*) OVER (PARTITION BY PatientRef, SeqNumber) AS SeqLength
FROM CTE_Sequences
)
SELECT
CaseRef
,PatientRef
,StartDate
,FinishDate
,PrevStart
,GapIsLargeMarker
,SeqNumber
,rnk
,SeqLength
FROM CTE_Ranks
WHERE SeqLength > 1
ORDER BY PatientRef, StartDate;
Result
+--------------------------------------+--------------------------------------+-------------------------+-------------------------+-------------------------+------------------+-----------+-----+-----------+
| CaseRef | PatientRef | StartDate | FinishDate | PrevStart | GapIsLargeMarker | SeqNumber | rnk | SeqLength |
+--------------------------------------+--------------------------------------+-------------------------+-------------------------+-------------------------+------------------+-----------+-----+-----------+
| 4BFA4E3B-D313-4777-A290-3C13601D5C95 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 22:00:46.000 | 2017-01-30 22:10:24.000 | 2017-01-30 00:22:41.000 | 1 | 2 | 1 | 9 |
| F910D4DE-3CEE-4429-8844-DDE860D08192 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 22:25:49.000 | 2017-01-30 22:27:58.000 | 2017-01-30 22:00:46.000 | 0 | 2 | 2 | 9 |
| DF28DC91-02E3-47F2-88E0-397C2CBCFE41 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 22:44:11.000 | 2017-01-30 22:53:22.000 | 2017-01-30 22:25:49.000 | 0 | 2 | 3 | 9 |
| D6964286-8AE7-46AB-8DA5-88A347015C4D | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 22:55:17.000 | 2017-01-30 23:01:57.000 | 2017-01-30 22:44:11.000 | 0 | 2 | 4 | 9 |
| 660B2ED7-B715-4A6C-A92B-D80267C0E4F5 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 23:06:16.000 | 2017-01-30 23:08:28.000 | 2017-01-30 22:55:17.000 | 0 | 2 | 5 | 9 |
| 903AC539-4BB1-44AB-AFDB-D86C13310011 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 23:15:21.000 | 2017-01-30 23:16:02.000 | 2017-01-30 23:06:16.000 | 0 | 2 | 6 | 9 |
| 75B88E5F-4795-4A21-9EA6-3B41CE958250 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 23:28:31.000 | 2017-01-30 23:29:53.000 | 2017-01-30 23:15:21.000 | 0 | 2 | 7 | 9 |
| DD6A4BD5-EF75-44CE-9309-4C14B2A21FF4 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 23:45:42.000 | 2017-01-30 23:46:13.000 | 2017-01-30 23:28:31.000 | 0 | 2 | 8 | 9 |
| 518319BA-0EDE-46D8-B0B7-E8CEB233DEDF | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-30 23:54:02.000 | 2017-01-31 00:03:13.000 | 2017-01-30 23:45:42.000 | 0 | 2 | 9 | 9 |
| FB5A5A54-E580-40F2-94FD-64E20EA5C4DD | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-31 16:13:01.000 | 2017-01-31 16:21:02.000 | 2017-01-30 23:54:02.000 | 1 | 3 | 1 | 2 |
| 8A4FD0C3-59BF-43AB-A829-F2396D6FB26A | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-01-31 18:26:14.000 | 2017-01-31 18:39:20.000 | 2017-01-31 16:13:01.000 | 0 | 3 | 2 | 2 |
| 8CB94AF1-9664-4081-A2E1-271ED16B147B | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-02-01 08:10:41.000 | 2017-02-01 08:18:18.000 | 2017-01-31 18:26:14.000 | 1 | 4 | 1 | 2 |
| 0DC6B68B-0458-48DF-B286-C1A978653981 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-02-01 15:40:45.000 | 2017-02-01 15:48:24.000 | 2017-02-01 08:10:41.000 | 0 | 4 | 2 | 2 |
| DB239857-6870-4AD9-8149-69ED6151CCB2 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-02-02 16:54:40.000 | 2017-02-02 17:10:27.000 | 2017-02-01 15:40:45.000 | 1 | 5 | 1 | 3 |
| 938CCFF4-66C9-48B1-BDB7-D9144D2BD522 | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-02-02 19:29:18.000 | 2017-02-02 19:30:14.000 | 2017-02-02 16:54:40.000 | 0 | 5 | 2 | 3 |
| 1EC730D0-AF85-45BF-BD06-12B23124151F | A3E14866-4DD5-4001-AF63-21819F49B401 | 2017-02-02 19:43:28.000 | 2017-02-02 19:47:12.000 | 2017-02-02 19:29:18.000 | 0 | 5 | 3 | 3 |
+--------------------------------------+--------------------------------------+-------------------------+-------------------------+-------------------------+------------------+-----------+-----+-----------+
Run the query step by step, CTE by CTE, and examine the intermediate results to understand how it works.
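One way to do that is to cut the chain short and select from an intermediate CTE. Below is a minimal sketch that stops after the marker step. The @demo table variable and its three rows are made up for illustration, and the third argument of LAG is used here as an alternative to wrapping LAG in ISNULL:

```sql
-- Hypothetical three-row sample, not the full data set from the question.
declare @demo as table ([PatientRef] int, [StartDate] datetime);
insert into @demo values
(1, '2017-01-30T00:22:41'),
(1, '2017-01-30T22:00:46'),
(1, '2017-01-30T22:25:49');

with CTE_Prev as
(
    select
        PatientRef,
        StartDate,
        -- The third argument of LAG is the value returned when there is no previous row.
        lag(StartDate, 1, cast('20000101' as datetime))
            over (partition by PatientRef order by StartDate) as PrevStart
    from @demo
),
CTE_Markers as
(
    select
        PatientRef,
        StartDate,
        PrevStart,
        case when datediff(hour, PrevStart, StartDate) >= 12 then 1 else 0 end as GapIsLargeMarker
    from CTE_Prev
)
-- Stop here instead of continuing to the later CTEs.
select * from CTE_Markers order by PatientRef, StartDate;
```

With these three rows the markers should come out as 1, 1, 0: the first row has no predecessor, the second follows a gap of about 21.5 hours, and the third follows 25 minutes after the second.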
CTE_Prev returns PrevStart from the previous row. If it is the first row for a patient, LAG returns NULL, so I set it to the constant 2000-01-01.
CTE_Markers returns GapIsLargeMarker, set to 1 if DATEDIFF(hour, PrevStart, StartDate) is 12 or more. It marks with 1 those rows where a new "sequence" starts.
CTE_Sequences fills in the sequence numbers SeqNumber using a running total of the markers.
CTE_Ranks calculates row numbers (rnk) within each sequence and how many rows (SeqLength) are in each sequence.
Finally, we return only those sequences that have more than 1 row in them.
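One caveat about the marker step: DATEDIFF(hour, ...) counts the hour boundaries crossed between the two values, not the elapsed time, so two calls 11 hours 2 minutes apart can already produce 12. The sketch below uses two made-up timestamps to show this, along with a DATEADD comparison that could be used instead if the 12-hour rule is meant as elapsed time (that reading of the requirement is an assumption):

```sql
-- Made-up values for illustration: @StartDate is 11 h 2 min after @PrevStart.
declare @PrevStart as datetime = '2017-01-30T10:59:00';
declare @StartDate as datetime = '2017-01-30T22:01:00';

select
    datediff(hour, @PrevStart, @StartDate)   as HourBoundariesCrossed,  -- 12
    datediff(minute, @PrevStart, @StartDate) as ElapsedMinutes,         -- 662
    -- Marker as written in the query above: fires because 12 hour boundaries were crossed.
    case when datediff(hour, @PrevStart, @StartDate) >= 12
         then 1 else 0 end                   as MarkerByDateDiff,       -- 1
    -- Alternative: fires only once a full 12 hours have elapsed.
    case when @StartDate >= dateadd(hour, 12, @PrevStart)
         then 1 else 0 end                   as MarkerByElapsedTime;    -- 0
```

Which of the two behaviours is wanted depends on how the business defines "within 12 hours".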
Answer 2:
In general, you could do
...WHERE EXISTS(
SELECT 1 FROM dbo.[111Data] as [d2]
WHERE D2.StartDate BETWEEN
D1.StartDate
AND DATEADD(hour, 12, D1.StartDate)
)
to select each record that has another record following it within 12 hours. Have you tried anything like that?
I'm not sure what you mean by the second part regarding ranking and counting within the time frame. You could probably accomplish it using a sub-query.
Sorry, I made a mistake: you need to exclude the record under scrutiny from the EXISTS:
...WHERE EXISTS(
SELECT 1 FROM dbo.[111Data] as [d2]
WHERE D2.StartDate >
D1.StartDate
AND D2.StartDate <= DATEADD(hour, 12, D1.StartDate)
)
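The fragment above leaves the outer query elided and does not restrict the match to the same patient. A self-contained sketch of the same idea follows, with that restriction added. The @calls table variable, its integer keys, and its four rows are hypothetical stand-ins for dbo.[111Data]:

```sql
-- Hypothetical sample: patient 1 has three calls, patient 2 has one.
declare @calls as table ([CaseRef] int, [PatientRef] int, [StartDate] datetime);
insert into @calls values
(1, 1, '2017-01-30T00:22:41'),
(2, 1, '2017-01-30T22:00:46'),
(3, 1, '2017-01-30T22:25:49'),
(4, 2, '2017-01-30T22:30:00');

select c1.CaseRef, c1.PatientRef, c1.StartDate
from @calls as c1
where exists
(
    select 1
    from @calls as c2
    where c2.PatientRef = c1.PatientRef                          -- same patient only
      and c2.StartDate > c1.StartDate                            -- exclude the row itself
      and c2.StartDate <= dateadd(hour, 12, c1.StartDate)        -- next call within 12 hours
);
```

On this sample only CaseRef 2 should be returned, since it is the only call followed by another call from the same patient within 12 hours. The last call of a chain (CaseRef 3 here) is not returned by this filter alone, and the filter does not rank the calls.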
Source: https://stackoverflow.com/questions/42133711/how-to-include-only-rows-where-the-following-row-is-within-12-hours-and-rank-acc