How can I traverse a tree bottom-up to calculate a (weighted) average of node values in PostgreSQL?

Posted by 风格不统一 on 2019-12-04 12:04:17

Your data can be stored as:

create table tree(id int primary key, parent int, caption text, node_value int);
insert into tree values
(1, 0, 'A', null),
(2, 1, 'B', 6),
(3, 1, 'C', null),
(4, 3, 'D', null),
(5, 4, 'E', 10),
(6, 4, 'F', 2),
(7, 1, 'H', 18),
(8, 7, 'I', 102),
(9, 7, 'J', 301);

The simplest way to do bottom-up aggregation is a recursive function.

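create or replace function get_node_value(node_id int)
returns int language plpgsql as $$
declare
    val int;
begin
    -- a node that has its own value is returned as-is
    select node_value
    from tree
    where id = node_id
    into val;
    if val is null then
        -- otherwise average the recursively computed values of its children;
        -- avg() yields numeric, which is rounded on assignment to the int variable
        select avg(get_node_value(id))
        from tree
        where parent = node_id
        into val;
    end if;
    return val;
end;
$$;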

select get_node_value(1);

 get_node_value 
----------------
             10
(1 row)
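
To see where the 10 comes from, it can help to evaluate every node rather than only the root. A small sketch using the function and sample data above; the arithmetic in the comments was worked out by hand from that data:

```sql
-- Evaluate each node to see how values propagate upward.
select id, caption, node_value, get_node_value(id) as calculated
from tree
order by id;

-- For the sample data:
--  id | caption | node_value | calculated
--   1 | A       |            |  10   -- avg(B = 6, C = 6, H = 18)
--   2 | B       |          6 |   6
--   3 | C       |            |   6   -- only child is D
--   4 | D       |            |   6   -- avg(E = 10, F = 2)
--   5 | E       |         10 |  10
--   6 | F       |          2 |   2
--   7 | H       |         18 |  18   -- own value wins, I and J are ignored
--   8 | I       |        102 | 102
--   9 | J       |        301 | 301
```

Note that H keeps its own value, so its children never contribute to A.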


The same can be done in a plain SQL function. The code is less obvious, but it may be a bit faster than the PL/pgSQL version.

create or replace function get_node_value_sql(node_id int)
returns int language sql as $$
    select coalesce(
        node_value,
        (
            select avg(get_node_value_sql(id))::int
            from tree
            where parent = node_id
        )
    )
    from tree 
    where id = node_id;
$$;
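
The question also mentions a weighted average, which neither function above computes. Below is a minimal sketch of one way to add weights. It assumes a hypothetical node_weight table and a hypothetical function name get_weighted_node_value, neither of which is part of the original schema; nodes without a weight row count as weight 1. Unlike the integer versions it returns numeric and does not round at each level.

```sql
-- Assumption: weights live in a separate, hypothetical table,
-- so the tree table itself stays unchanged.
create table if not exists node_weight(id int primary key, weight numeric not null);
insert into node_weight values (7, 3) on conflict (id) do nothing;

create or replace function get_weighted_node_value(node_id int)
returns numeric language sql as $$
    select coalesce(
        t.node_value::numeric,
        (
            -- weighted mean of the children: sum(w * v) / sum(w)
            select sum(c.w * c.v) / nullif(sum(c.w), 0)
            from (
                select coalesce(w.weight, 1) as w,
                       get_weighted_node_value(ch.id) as v
                from tree ch
                left join node_weight w on w.id = ch.id
                where ch.parent = node_id
            ) c
            where c.v is not null  -- skip children that have no value at all
        )
    )
    from tree t
    where t.id = node_id;
$$;

select get_weighted_node_value(1);
-- with H weighted 3: (1*6 + 1*6 + 3*18) / (1 + 1 + 3) = 13.2
```

With every weight equal to 1 this reduces to the plain average of the children.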

Traversing the tree from the bottom up with a recursive CTE is not especially complicated. The difficulty in this particular case is that the average has to be computed for each level separately.

with recursive bottom_up(id, parent, caption, node_value, level, calculated) as (
    select 
        *, 
        0, 
        node_value calculated
    from tree t
    where not exists (
        select id
        from tree
        where parent = t.id)
union all
    select 
        t.*, 
        b.level + 1,
        case when t.node_value is null then b.calculated else t.node_value end
    from tree t
    join bottom_up b on t.id = b.parent
)

select id, parent, caption, avg(calculated)::int calculated
from (
    select id, parent, caption, level, avg(calculated)::int calculated
    from bottom_up
    group by 1, 2, 3, 4
    ) s
group by 1, 2, 3
order by 1;
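
To make the per-level averaging concrete, the sketch below reruns the same recursive CTE (written slightly more compactly, with coalesce in place of the case expression) and shows only the rows that reach the root node A. The annotated result was worked out by hand from the sample data.

```sql
with recursive bottom_up(id, parent, caption, node_value, level, calculated) as (
    -- start from the leaves
    select t.*, 0, t.node_value
    from tree t
    where not exists (select 1 from tree c where c.parent = t.id)
union all
    -- climb to the parent; a parent's own value overrides what came from below
    select t.*, b.level + 1, coalesce(t.node_value, b.calculated)
    from tree t
    join bottom_up b on t.id = b.parent
)
select level, count(*) as paths, avg(calculated)::int as level_avg
from bottom_up
where id = 1
group by level
order by level;

--  level | paths | level_avg
--      1 |     1 |         6   -- via B
--      2 |     2 |        18   -- via H, reached from I and J
--      3 |     2 |         6   -- via C, reached from E and F
-- the outer query then averages these: avg(6, 18, 6) = 10
```

In this sample each child subtree of A arrives at a different level, which is why grouping by level reproduces the function's result; for trees of other shapes it seems worth comparing the query's output against get_node_value.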

