Recursive Insert using connect by clause

Submitted by 冷暖自知 on 2019-12-04 07:51:21

If all leaf nodes are at the same height (here lvl=4), you can write a simple CONNECT BY query with a ROLLUP:

SELECT lvl0,
       regexp_substr(path, '[^/]+', 1, 2) lvl1,
       regexp_substr(path, '[^/]+', 1, 3) lvl2,
       SUM(VALUE) sum_value
  FROM (SELECT sys_connect_by_path(t.element, '/') path,
               connect_by_root(t.element) lvl0,
               t.element, d.VALUE, LEVEL lvl
           FROM tree t
           LEFT JOIN DATA d ON d.element = t.element
          START WITH t.PARENT IS NULL
         CONNECT BY t.PARENT = PRIOR t.element)
 WHERE VALUE IS NOT NULL
   AND lvl = 4
 GROUP BY lvl0, ROLLUP(regexp_substr(path, '[^/]+', 1, 2),
                       regexp_substr(path, '[^/]+', 1, 3));

LVL0 LVL1  LVL2   SUM_VALUE
---- ----- ----- ----------
P0   P1    P11            6
P0   P1    P12            6
P0   P1                  12
P0   P2    P21            6
P0   P2    P22            6
P0   P2                  12
P0                       24
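
The tables themselves are not shown in this answer, so the following is only a sketch of sample data that would reproduce the output above. The table and column names come from the query; the column types, the leaf names (P111 to P222) and the individual values are assumptions, chosen so that every level-3 node sums to 6.

```sql
-- Assumed schema: names match the queries in this answer, types are guesses.
CREATE TABLE tree (element VARCHAR2(10) PRIMARY KEY, parent VARCHAR2(10));
CREATE TABLE data (element VARCHAR2(10) PRIMARY KEY, value NUMBER);

-- Inner nodes: root P0, two children, four grandchildren.
INSERT INTO tree (element, parent) VALUES ('P0',  NULL);
INSERT INTO tree (element, parent) VALUES ('P1',  'P0');
INSERT INTO tree (element, parent) VALUES ('P2',  'P0');
INSERT INTO tree (element, parent) VALUES ('P11', 'P1');
INSERT INTO tree (element, parent) VALUES ('P12', 'P1');
INSERT INTO tree (element, parent) VALUES ('P21', 'P2');
INSERT INTO tree (element, parent) VALUES ('P22', 'P2');

-- Hypothetical leaves at level 4, two per level-3 node.
INSERT INTO tree (element, parent) VALUES ('P111', 'P11');
INSERT INTO tree (element, parent) VALUES ('P112', 'P11');
INSERT INTO tree (element, parent) VALUES ('P121', 'P12');
INSERT INTO tree (element, parent) VALUES ('P122', 'P12');
INSERT INTO tree (element, parent) VALUES ('P211', 'P21');
INSERT INTO tree (element, parent) VALUES ('P212', 'P21');
INSERT INTO tree (element, parent) VALUES ('P221', 'P22');
INSERT INTO tree (element, parent) VALUES ('P222', 'P22');

-- Only the leaves carry values; 2 + 4 = 6 per level-3 node.
INSERT INTO data (element, value) VALUES ('P111', 2);
INSERT INTO data (element, value) VALUES ('P112', 4);
INSERT INTO data (element, value) VALUES ('P121', 2);
INSERT INTO data (element, value) VALUES ('P122', 4);
INSERT INTO data (element, value) VALUES ('P211', 2);
INSERT INTO data (element, value) VALUES ('P212', 4);
INSERT INTO data (element, value) VALUES ('P221', 2);
INSERT INTO data (element, value) VALUES ('P222', 4);

COMMIT;
```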

The insert would look like:

INSERT INTO data (element, value) 
(SELECT coalesce(lvl2, lvl1, lvl0), sum_value
   FROM <query> d_out
  WHERE NOT EXISTS (SELECT NULL
                      FROM data d_in
                     WHERE d_in.element = coalesce(lvl2, lvl1, lvl0)));
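
For reference, here is a sketch of the same statement with the ROLLUP query substituted for the `<query>` placeholder. It only assembles the two pieces shown above; the `d_out.` qualifiers in the NOT EXISTS subquery are my addition for readability, and the statement still assumes that all leaves sit at level 4.

```sql
INSERT INTO data (element, value)
SELECT COALESCE(d_out.lvl2, d_out.lvl1, d_out.lvl0), d_out.sum_value
  FROM (SELECT lvl0,
               regexp_substr(path, '[^/]+', 1, 2) lvl1,
               regexp_substr(path, '[^/]+', 1, 3) lvl2,
               SUM(value) sum_value
          FROM (SELECT sys_connect_by_path(t.element, '/') path,
                       connect_by_root(t.element) lvl0,
                       d.value, LEVEL lvl
                  FROM tree t
                  LEFT JOIN data d ON d.element = t.element
                 START WITH t.parent IS NULL
               CONNECT BY t.parent = PRIOR t.element)
         WHERE value IS NOT NULL
           AND lvl = 4                       -- leaves only
         GROUP BY lvl0, ROLLUP(regexp_substr(path, '[^/]+', 1, 2),
                               regexp_substr(path, '[^/]+', 1, 3))) d_out
 -- skip nodes that already have a row in DATA
 WHERE NOT EXISTS (SELECT NULL
                     FROM data d_in
                    WHERE d_in.element =
                          COALESCE(d_out.lvl2, d_out.lvl1, d_out.lvl0));
```

Because of the NOT EXISTS guard, running the statement a second time should insert no further rows.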

If the height of the leaf nodes is unknown or unbounded, this gets hairier. The approach above no longer works, because ROLLUP needs to know exactly how many columns to consider.

In that case, you could use the tree structure in a self-join:

WITH HIERARCHY AS (
   SELECT t.element, path, VALUE
     FROM (SELECT sys_connect_by_path(t.element, '/') path,
                  connect_by_isleaf is_leaf, ELEMENT
              FROM tree t
             START WITH t.PARENT IS NULL
            CONNECT BY t.PARENT = PRIOR t.element) t
     LEFT JOIN DATA d ON d.element = t.element
                     AND t.is_leaf = 1
)
SELECT h.element, SUM(elements.value)
  FROM HIERARCHY h
  JOIN HIERARCHY elements ON elements.path LIKE h.path||'/%'
 WHERE h.VALUE IS NULL
 GROUP BY h.element
 ORDER BY 1;

ELEMENT SUM(ELEMENTS.VALUE)
------- -------------------
P0                       24
P1                       12
P11                       6
P12                       6
P2                       12
P21                       6
P22                       6
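
The self-join query can be turned into an insert in the same way as the ROLLUP variant. The following is a sketch rather than a tested statement: the `leaf` alias and the NOT EXISTS guard (borrowed from the insert shown earlier) are my additions, and the ORDER BY is dropped because it has no effect on an insert.

```sql
INSERT INTO data (element, value)
WITH hierarchy AS (
   -- every node with its path; VALUE is populated for leaves only
   SELECT t.element, t.path, d.value
     FROM (SELECT sys_connect_by_path(element, '/') path,
                  connect_by_isleaf is_leaf, element
             FROM tree
            START WITH parent IS NULL
          CONNECT BY parent = PRIOR element) t
     LEFT JOIN data d ON d.element = t.element
                     AND t.is_leaf = 1
)
SELECT h.element, SUM(leaf.value)
  FROM hierarchy h
  -- a descendant's path starts with its ancestor's path plus '/'
  JOIN hierarchy leaf ON leaf.path LIKE h.path || '/%'
 WHERE h.value IS NULL
   -- skip nodes that already have a row in DATA
   AND NOT EXISTS (SELECT NULL
                     FROM data d_in
                    WHERE d_in.element = h.element)
 GROUP BY h.element;
```

Two caveats: a node whose leaves have no rows in DATA would be inserted with a NULL value (adding HAVING SUM(leaf.value) IS NOT NULL would filter those out), and the LIKE match assumes that element names contain no `%` or `_` characters.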

Here is another option using the SQL MODEL clause. I've taken some hints from Vincent's answer (the use of regexp_substr) to simplify my code.

The first part, within the WITH clause, just rearranges the data and extracts the hierarchy at each level.

The MODEL clause, at the end of the query, rolls the values up from the lowest levels. It will need additional columns if there are more than four levels, but it should work no matter at which level the values are held.

I'm not entirely sure that this will work in all circumstances, since I'm not that experienced with the MODEL clause, but it does at least seem to work in this case.

with my_hierarchy_data as (
select 
    element,
    value, 
    path, 
    parent,
    lvl0,
    regexp_substr(path, '[^/]+', 1, 2) as lvl1,
    regexp_substr(path, '[^/]+', 1, 3) as lvl2,
    regexp_substr(path, '[^/]+', 1, 4) as lvl3
from ( 
  select 
    element,
    value, 
    parent,
    sys_connect_by_path(element, '/') as path, 
    connect_by_root element as lvl0
  from 
    tree
    left outer join data using (element)
  start with parent is null
  connect by prior element = parent
  order siblings by element
  )
)
select 
    element,
    value, 
    path, 
    parent,
    new_value,
    lvl0, 
    lvl1, 
    lvl2, 
    lvl3
from my_hierarchy_data
model
return all rows
partition by (lvl0)
dimension by (lvl1, lvl2, lvl3)
measures(element, parent, value, value as new_value, path)
rules sequential order (
    new_value[lvl1, lvl2, null] = sum(value)[cv(lvl1), cv(lvl2), lvl3 is not null],
    new_value[lvl1, null, null] = sum(new_value)[cv(lvl1), lvl2 is not null, null],
    new_value[null, null, null] = sum(new_value)[lvl1 is not null, null, null]
)

The insert statement you can use is:

INSERT INTO data (element, value)
select element, new_value
from <the_query>
where value is null;