问题
I have a subset of records that look like this:
ID DATE
A 2015-09-01
A 2015-10-03
A 2015-10-10
B 2015-09-01
B 2015-09-10
B 2015-10-03
...
For each ID the first minimum date is the first index record. Now I need to exclude cases within 30 days of the index record, and any record with a date greater than 30 days becomes another index record.
For example, for ID A, 2015-09-01 and 2015-10-03 are both index records and would be retained since they are more than 30 days apart. 2015-10-10 would be dropped because it's within 30 days of the 2nd index case.
For ID B, 2015-09-10 would be dropped and would NOT be an index case because it's within 30 days of the 1st index record. 2015-10-03 would be retained because it's greater than 30 days of the 1st index record and would be considered the 2nd index case.
The output should look like this:
ID DATE
A 2015-09-01
A 2015-10-03
B 2015-09-01
B 2015-10-03
How do I do this in SQL server 2012? There's no limit to how many dates an ID can have, could be just 1 to as many as 5 or more. I'm fairly basic with SQL so any help would be greatly appreciated.
回答1:
working like in your example, #test is your table with data:
;with cte1
as
(
select
ID, Date,
row_number()over(partition by ID order by Date) groupID
from #test
),
cte2
as
(
select ID, Date, Date as DateTmp, groupID, 1 as getRow from cte1 where groupID=1
union all
select
c1.ID,
c1.Date,
case when datediff(Day, c2.DateTmp, c1.Date) > 30 then c1.Date else c2.DateTmp end as DateTmp,
c1.groupID,
case when datediff(Day, c2.DateTmp, c1.Date) > 30 then 1 else 0 end as getRow
from cte1 c1
inner join cte2 c2 on c2.groupID+1=c1.groupID and c2.ID=c1.ID
)
select ID, Date from cte2 where getRow=1 order by ID, Date
回答2:
select * from
(
select ID,DATE_, case when DATE_DIFF is null then 1 when date_diff>30 then 1 else 0 end comparison from
(
select ID, DATE_ ,DATE_-LAG(DATE_, 1) OVER (PARTITION BY ID ORDER BY DATE_) date_diff from trial
)
)
where comparison=1 order by ID,DATE_;
Tried in Oracle Database. Similar funtions exist in SQL Server too.
I am grouping by Id column, and based on DATE field, am comparing the date in current field with its previous field. The very first row of a given user id would return null, and first field is required in our output as first index. For all other fields, we return 1 when the date difference with respect to previous field is greater than 30.
Lag function in transact sql
Case function in transact sql
回答3:
Your logic explained in question is wrong,at one place ,you have said take the first index record and in next place you considered immediate record..
This works for immediate records:
with cte
as
(
select *, ROW_NUMBER() over (partition by id order by datee) as rownum
from #test
)
select *,datediff(day,beforedate,datee)
from cte t1
cross apply
(Select isnull(max(Datee),t1.datee) as beforedate from cte t2 where t1.id =t2.id and t2.rownum<t1.rownum) b
where datediff(day,beforedate,datee)= 0 or datediff(day,beforedate,datee)>=30
This works for constant base record:
select *,datediff(day,basedate,datee) from #test t1
cross apply
(select min(Datee) as basedate from #test t2 where t1.id=t2.id)b
where datediff(day,basedate,datee)>=30 or datediff(day,basedate,datee)=0
回答4:
Try this solution.
Sample demo
with diffs as (
select t1.id,t1.dt strtdt,t2.dt enddt,datediff(dd,t1.dt,t2.dt) daysdiff
from t t1
join t t2 on t1.id=t2.id and t1.dt<t2.dt
)
, y as (
select id,strtdt,enddt
from (
select id,strtdt,enddt,row_number() over(partition by id,strtdt order by daysdiff) as rn
from diffs
where daysdiff > 30
) x
where rn=1
)
,z as (
select *,coalesce(lag(enddt) over(partition by id order by strtdt),strtdt) prevend
from y)
select id,strtdt from z where strtdt=prevend
union
select id,enddt from z where strtdt=prevend
来源:https://stackoverflow.com/questions/36805651/how-to-get-next-minimum-date-that-is-not-within-30-days-and-use-as-reference-poi