Get a number of unique values without separating values that belong to the same block of values

伪装坚强ぢ 2020-12-19 15:06

I'm OK with either a PL/SQL solution or an Access VBA/Excel VBA one (though Access VBA is preferred over Excel VBA). So, PL/SQL is the first choice, Access VBA is second a

6 answers
  •  长情又很酷
    2020-12-19 15:24

    This gets you most of the way there in plain SQL. It's not quite perfect, and I expect the MODEL clause is what would work best, but...

    What this does is:

    1. In all_possible, work out every possible combination of rows that share a fax.
    2. In some_counting, unpivot this and count the number of unique otherids per fax. We can also restrict this to at most 6 here, so that we exclude any faxes which are never going to qualify.
    3. In uniquify, use row_number() so that records with the same number of otherids per fax can be told apart later, and also work out the greatest count. If this is 6 then you've got a simple win.
    4. In cumulative_sum, work out the running sum of the number of otherids per fax. The trick here is the order in which you do it. I've chosen to pick the greatest first and then add in the smaller ones. I'm sure there's a cleverer way to do this... I did this because if the greatest is 6, you win. If it's 4, say, then you can fill it in with 2 faxes which only have 1 associated otherid each, etc.
    5. Lastly, restrict to the records whose cumulative sum is at most 6 and pull in all the extra data you need.
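The ordering trick in steps 2 to 5 is easier to follow outside the database. Below is a minimal Python sketch of that logic, not the answer's actual method: the sample rows, the `pick_faxes` name and the `target` parameter are all hypothetical, and ties are broken by fax number where the SQL relies on row_number().

```python
from collections import defaultdict

# Hypothetical sample rows (r, otherid, fax); not taken from the question.
ROWS = [
    (1, 10, 100), (2, 11, 100), (3, 12, 100), (4, 13, 100),  # fax 100: 4 otherids
    (5, 20, 200),                                            # fax 200: 1 otherid
    (6, 30, 300), (7, 31, 300),                              # fax 300: 2 otherids
    (8, 40, 400),                                            # fax 400: 1 otherid
]


def pick_faxes(rows, target=6):
    """Return (fax, distinct_otherids, running_sum) for the faxes kept."""
    # Step 2: distinct otherids per fax, dropping faxes that exceed the target.
    ids_per_fax = defaultdict(set)
    for _r, otherid, fax in rows:
        ids_per_fax[fax].add(otherid)
    counts = {fax: len(ids) for fax, ids in ids_per_fax.items()
              if len(ids) <= target}
    if not counts:
        return []

    # Step 3: the greatest count among the remaining faxes.
    greatest = max(counts.values())

    # Step 4: greatest first, then the rest in ascending order of count.
    # The fax number is an assumed tie-breaker standing in for row_number().
    ordered = sorted(
        counts.items(),
        key=lambda kv: (0 if kv[1] == greatest else 1, kv[1], kv[0]),
    )

    # Step 5: keep faxes while the running sum stays within the target.
    chosen, running = [], 0
    for fax, n in ordered:
        running += n
        if running > target:
            break  # the running sum only grows, so nothing later can fit
        chosen.append((fax, n, running))
    return chosen


print(pick_faxes(ROWS))
```

On these rows the sketch keeps faxes 100, 200 and 400 (running sums 4, 5, 6) and drops fax 300. It also shows why the approach is "not quite perfect": with three faxes holding 4, 3 and 3 otherids, the greatest-first order keeps only the fax with 4, although the two faxes with 3 would add up to exactly 6.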

    Assuming a table as follows, filled with your data:

    create table tmp_table ( 
       r number
     , otherid number
     , fax number
       );
    

    The code would look like this:

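    -- Pair every row with the other rows that share its fax.
    with all_possible as (
    select t.r as t_r, t.otherid as t_otherid, t.fax as t_fax
         , u.r as u_r, u.otherid as u_otherid, u.fax as u_fax
      from tmp_table t
      left outer join tmp_table u
        on t.fax = u.fax
       and t.r <> u.r
           )
    -- Unpivot both sides back into rows and count distinct otherids per fax.
    , some_counting as (
     select fax 
          , count(distinct otherid) as no_o_per_fax
       from all_possible
    unpivot ( (r, otherid, fax) 
            for (a, b, c)
             in ( (t_r, t_otherid, t_fax)
                , (u_r, u_otherid, u_fax)
                ))
     group by fax
    -- <= rather than <, so that a fax with exactly 6 otherids still qualifies
    having count(distinct otherid) <= 6
            )
    -- rn separates faxes with equal counts; m_fax is the greatest count.
    , uniquify as (
    select c.*
         , row_number() over (order by no_o_per_fax asc) as rn
         , max(no_o_per_fax) over () as m_fax
      from some_counting c
           )
    -- Running sum: the greatest count first, then the smaller ones ascending.
    , cumulative_sum as (
    select u.*, sum(no_o_per_fax) over (order by case when no_o_per_fax = m_fax then 0 else 1 end
                                            , no_o_per_fax asc 
                                            , rn ) as csum
      from uniquify u
           )
    -- Keep only the faxes whose running sum still fits within 6.
    , candidates as (
    select a.*
      from cumulative_sum a
     where csum <= 6
           )
    -- b.* returns the per-fax counters; select a.* instead for the original rows.
    select b.*
      from tmp_table a
      join candidates b
        on a.fax = b.fax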
    


    I make extensive use of common table expressions (the WITH clauses) here to keep the code readable.
