T Table

EMPLID  CODE    DT
101     PPP     01-JAN-15
101     PPP     02-JAN-15
101     PPP     03-JAN-15
101     OOO     04-JAN-15
101     OOO     05-JAN-15
101     PPP     06-JAN-15
101     PPP     07-JAN-15
101     PPP     08-JAN-15
101     PPP     09-JAN-15

what I want is a result like the following:

EMPLID  CODE    RNK   DT
101     PPP     1     01-JAN-15
101     PPP     1     02-JAN-15
101     PPP     1     03-JAN-15
101     OOO     2     04-JAN-15
101     OOO     2     05-JAN-15
101     PPP     3     06-JAN-15
101     PPP     3     07-JAN-15
101     PPP     3     08-JAN-15
101     PPP     3     09-JAN-15

but I only seem to be getting results like the following:

SELECT EMPLID, CODE, DENSE_RANK() OVER(ORDER BY CODE) AS RNK, DT 
FROM T;


EMPLID  CODE    RNK   DT
101     OOO     1     05-JAN-15
101     OOO     1     04-JAN-15
101     PPP     2     03-JAN-15
101     PPP     2     08-JAN-15
101     PPP     2     09-JAN-15
101     PPP     2     06-JAN-15
101     PPP     2     07-JAN-15
101     PPP     2     02-JAN-15
101     PPP     2     01-JAN-15

Since the PPP values repeat after the OOO values but at later dates, I'm not sure how to group/order it so that I get the RNK field to consider the second set of PPP values as a third group.

Here is the code for the table to make this a little easier:

WITH
T AS(
SELECT '101' AS EMPLID, 'PPP' AS CODE, '01-JAN-15' AS DT FROM DUAL
UNION ALL
SELECT '101', 'PPP', '02-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'PPP', '03-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'OOO', '04-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'OOO', '05-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'PPP', '06-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'PPP', '07-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'PPP', '08-JAN-15' FROM DUAL
UNION ALL
SELECT '101', 'PPP', '09-JAN-15' FROM DUAL
)
SELECT * 
FROM T;

回答1:

I think you need two passes: First find all rows where the code changes, then build a group on this.

select emplid, code, sum(change) over (order by dt) as rnk, dt 
from
(
  select 
    emplid,
    code,
    case when lag(code) over (order by dt) = code then 0 else 1 end as change,
    dt
  from t
  order by dt
);

回答2:

If I understood correctly you want to order the results by dt. You can set the flag for every changing row and then sum the flags up to that row, which will give you the same result as DENSE_RANK:

SELECT 
    sel2.emplid,
    sel2.code, 
    SUM (sel2.flg) OVER (ORDER BY sel2.dt ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) rnk,
    sel2.dt
FROM
  ( SELECT
        sel1.emplid,
        sel1.code,
        CASE WHEN sel1.code = sel1.code_lag THEN 0 ELSE 1 END flg,
        sel1.dt
    FROM     
      ( SELECT emplid, code, dt, LAG(code,1,'XXX') OVER (ORDER BY dt) code_lag
        FROM T 
      ) sel1
  ) sel2
ORDER BY dt;

The only thing you should be aware of is that the last result will be treated as a separate rank, regardless of weather it actually is one or not.

回答3:

Recursive solution is also possible:

with s (u, s, a, n) as (
  select emplid, code, 1, dt from t where dt = date '2015-01-01'
  union all select emplid, code, a+decode(code, s, 0, 1), dt from t join s on dt=n+1 ) 
select * from s

SQLFiddle

来源：https://stackoverflow.com/questions/29849314/how-to-dense-rank-sets-of-values-that-repeat

标签

sql