Flag individuals that share common features with Oracle SQL

﹥>﹥吖頭↗ 提交于 2021-02-07 11:13:41

问题


Consider the following table:

ID  Feature
1   1
1   2
1   3
2   3
2   4
2   6
3   5
3   10
3   12
4   12
4   18
5   10
5   30

I would like to group the individuals based on overlapping features. If two of these groups again have overlapping features, I would consider both as one group. This process should be repeated until there are no overlapping features between groups. The result of this procedure on the table above would be:

ID  Feature Flag
1   1       A
1   2       A
1   3       A
2   3       A
2   4       A
2   6       A
3   5       B
3   10      B
3   12      B
4   12      B
4   18      B
5   10      B
5   30      B

So actually the problem I am trying to solve is finding connected components in a graph. Here [1,2,3] is the graph with ID 1 (see https://en.wikipedia.org/wiki/Connectivity_(graph_theory)). The problem is equivalent to this problem, however I would like to solve it with Oracle SQL.


回答1:


Here is one way to do this, using a hierarchical ("connect by") query. The first step is to extract the initial relationships from the base data; the hierarchical query is built on the result from this first step. I added one more row to the inputs to illustrate a node that is a connected component by itself.

You marked the connected components as A and B - of course, that won't work if you have, say, 30,000 connected components. In my solution, I use the minimum node name as the marker for each connected component.

with
  sample_data (id, feature) as (
    select 1,  1 from dual union all
    select 1,  2 from dual union all
    select 1,  3 from dual union all
    select 2,  3 from dual union all
    select 2,  4 from dual union all
    select 2,  6 from dual union all
    select 3,  5 from dual union all
    select 3, 10 from dual union all
    select 3, 12 from dual union all
    select 4, 12 from dual union all
    select 4, 18 from dual union all
    select 5, 10 from dual union all
    select 5, 30 from dual union all
    select 6, 40 from dual
  )
-- select * from sample_data; /*
, initial_rel(id_base, id_linked) as (
    select distinct s1.id, s2.id
      from sample_data s1 join sample_data s2
                          on s1.feature = s2.feature and s1.id <= s2.id
  )
-- select * from initial_rel; /*
select     id_linked as id, min(connect_by_root(id_base)) as id_group
from       initial_rel
start with id_base <= id_linked
connect by nocycle prior id_linked = id_base and id_base < id_linked
group by   id_linked
order by   id_group, id
;

Output:

     ID   ID_GROUP
------- ----------
      1          1
      2          1
      3          3
      4          3
      5          3
      6          6

Then, if you need to add the ID_GROUP as a FLAG to the base data, you can do so with a trivial join.



来源:https://stackoverflow.com/questions/52629059/flag-individuals-that-share-common-features-with-oracle-sql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!