问题
I have a problem that could be easily solved using curser in Oracle. However, I wonder if that could be done using select only. I have 1 data set that contains the following fields: Start, Description, MaximumRow, SequentialOrder.
The data set is ordered by Description, Start, SequentialOrder. This is the data for illustration purpose:

I would like to get the following results in a different data set (Start, End, Description) where Start is the minimum of the "Start" field in a set and End is the maximum of the "Start" field in the set. The set is defined by the following rule: Total number of rows in the new set does not exceed the maximumrow defined in the previous set and all rows in the new set are ordered by SequentialOrder.
Based on the rule above, I have the following sets:

So the results I would like to see based on the illustration is
12-Jun-15, 14-Jun-15, A
01-Jul-15, 01-Jul-15, A
02-Jul-15, 04-Jul-15, A
01-Aug-15, 16-Aug-15, B
If that could be done, please advise. I know we could group by Description but I do not know if we could do further grouping based on MaximumRow and SequentialOrder: As mentioned above, the total row in the subset to be evaluated can not exceed the MaximumRow AND has to be in ordered by SequentialOrder.
I do not think that could be done without using cursor but I ask anyway just in case there is.
I have attached the script to generate the sample above:
CREATE TABLE "TEST"
( "Start" DATE,
"Description" VARCHAR2(20 BYTE),
"MaximunRow" NUMBER,
"SequentialOrder" NUMBER
)
SET DEFINE OFF;
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('12-JUN-15','DD-MON-RR'),'A',3,3);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('13-JUN-15','DD-MON-RR'),'A',3,4);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('14-JUN-15','DD-MON-RR'),'A',3,5);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('01-JUL-15','DD-MON-RR'),'A',3,4);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('02-JUL-15','DD-MON-RR'),'A',3,3);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('04-JUL-15','DD-MON-RR'),'A',3,4);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('01-AUG-15','DD-MON-RR'),'B',2,5);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('16-AUG-15','DD-MON-RR'),'B',2,7);
回答1:
This is a variation of a gaps-and-islands problem, with the added complication of the maximum number of rows in each island. This is a bit long-winded but you could start by identifying the groups caused by the sequence order:
select t.*,
row_number() over (partition by "Description" order by "Start") as rn,
case when lag("SequentialOrder")
over (partition by "Description" order by "Start") < "SequentialOrder"
then 1 else 0 end as newblock
from test t
order by "Start";
Start Description MaximunRow SequentialOrder RN NEWBLOCK
--------- ----------- ---------- --------------- --- ----------
12-JUN-15 A 3 3 1 0
13-JUN-15 A 3 4 2 1
14-JUN-15 A 3 5 3 1
01-JUL-15 A 3 4 4 0
02-JUL-15 A 3 3 5 0
04-JUL-15 A 3 4 6 1
01-AUG-15 B 2 5 1 0
16-AUG-15 B 2 7 2 1
You can then use a recursive CTE (from 11gR2 onwards) based on that:
with u as (
select t.*,
row_number() over (partition by "Description" order by "Start") as rn,
case when lag("SequentialOrder")
over (partition by "Description" order by "Start") < "SequentialOrder"
then 1 else 0 end as newblock
from test t
),
r ("Start", "Description", "MaximunRow", "SequentialOrder", rn, blocknum,
pos, lastmaxrow) as (
select u."Start", u."Description", u."MaximunRow", u."SequentialOrder", u.rn,
1, 1, u."MaximunRow"
from u
where rn = 1
union all
select u."Start", u."Description", u."MaximunRow", u."SequentialOrder", u.rn,
case when r.pos = r.lastmaxrow or u.newblock = 0
then r.blocknum + 1 else r.blocknum end,
case when r.pos = r.lastmaxrow or u.newblock = 0
then 1 else r.pos + 1 end,
case when r.pos = r.lastmaxrow or u.newblock = 0
then r.lastmaxrow else u."MaximunRow" end
from r
join u on u."Description" = r."Description" and u.rn = r.rn + 1
)
select * from r
order by "Start";
Start Description MaximunRow SequentialOrder RN BLOCKNUM POS LASTMAXROW
--------- ----------- ---------- --------------- --- ---------- ---- ----------
12-JUN-15 A 3 3 1 1 1 3
13-JUN-15 A 3 4 2 1 2 3
14-JUN-15 A 3 5 3 1 3 3
01-JUL-15 A 3 4 4 2 1 3
02-JUL-15 A 3 3 5 3 1 3
04-JUL-15 A 3 4 6 3 2 3
01-AUG-15 B 2 5 1 1 1 2
16-AUG-15 B 2 7 2 1 2 2
This is assigning a blocknum
to each row, with that starting at one for each description in the anchor member, and being incremented in the recursive member either if the newblock
is zero (indicating a sequence break) or the number of members in the block is the previous maximum. (I may not have the logic for 'previous maximum' quite right as it isn't clear in the question.)
You can then group by the description and the generated block number:
with u as (
select t.*,
row_number() over (partition by "Description" order by "Start") as rn,
case when lag("SequentialOrder")
over (partition by "Description" order by "Start") < "SequentialOrder"
then 1 else 0 end as newblock
from test t
),
r ("Start", "Description", "MaximunRow", "SequentialOrder", rn, blocknum,
pos, lastmaxrow) as (
select u."Start", u."Description", u."MaximunRow", u."SequentialOrder", u.rn,
1, 1, u."MaximunRow"
from u
where rn = 1
union all
select u."Start", u."Description", u."MaximunRow", u."SequentialOrder", u.rn,
case when r.pos = r.lastmaxrow or u.newblock = 0
then r.blocknum + 1 else r.blocknum end,
case when r.pos = r.lastmaxrow or u.newblock = 0
then 1 else r.pos + 1 end,
case when r.pos = r.lastmaxrow or u.newblock = 0
then r.lastmaxrow else u."MaximunRow" end
from r
join u on u."Description" = r."Description" and u.rn = r.rn + 1
)
select min(r."Start") as "Start", max(r."Start") as "End", r."Description"
from r
group by r."Description", r.blocknum
order by r."Description", r.blocknum;
Start End Description
--------- --------- -----------
12-JUN-15 14-JUN-15 A
01-JUL-15 01-JUL-15 A
02-JUL-15 04-JUL-15 A
01-AUG-15 16-AUG-15 B
Your sample data doesn't trigger the maximum rows break as you don't have any sequences longer than 3 anyway. With some additional data:
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('15-JUN-15','DD-MON-RR'),'A',3,7);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('16-JUN-15','DD-MON-RR'),'A',3,8);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('17-JUN-15','DD-MON-RR'),'A',3,10);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('18-JUN-15','DD-MON-RR'),'A',3,12);
Insert into TEST ("Start","Description","MaximunRow","SequentialOrder") values (to_date('19-JUN-15','DD-MON-RR'),'A',3,13);
the same query gets:
Start End Description
--------- --------- -----------
12-JUN-15 14-JUN-15 A
15-JUN-15 17-JUN-15 A
18-JUN-15 19-JUN-15 A
01-JUL-15 01-JUL-15 A
02-JUL-15 04-JUL-15 A
01-AUG-15 16-AUG-15 B
so you can see it's splitting on sequence change and on hitting three rows in the block.
SQL Fiddle demo.
You could get away with just the recursive CTE, and not the previous intermediate one, by comparing the sequential order directly in the case statements instead of using newblock
; but having rn
to find the next row is easier than trying to find the next date as they aren't contiguous.
来源:https://stackoverflow.com/questions/30803021/gaps-and-islands-solution-in-oracle-use-of-recursive