PostgreSQL ltree query to find parent with most children, excluding root

廉价感情. Submitted on 2019-12-14 02:32:07

Question


I am using PostgreSQL and have a table with a path column that is of type ltree.

The problem I am trying to solve is: given the whole tree structure, which parent has the most children, excluding the root?

Sample data looks like this:

path column = ; has a depth of 0 and has 11 children its id is 1824 # don't want this one because it's the root
path column = ; has a depth of 0 and has 1 children its id is 1823
path column = 1823; has a depth of 1 and has 1 children its id is 1825
path column = 1823.1825; has a depth of 2 and has 1 children its id is 1826
path column = 1823.1825.1826; has a depth of 3 and has 1 children its id is 1827
path column = 1823.1825.1826.1827; has a depth of 4 and has 1 children its id is 1828
path column = 1824.1925.1955.1959.1972.1991; has a depth of 6 and has 5 children its id is 2001
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2141
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2040
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2054
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2253
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2166
path column = 1824.1925.1955.1959.1972.1991.2001.2054; has a depth of 8 and has 0 children its id is 2205
path column = 1824.1925.1955.1959.1972.1991.2001.2141; has a depth of 8 and has 0 children its id is 2161
path column = 1824.1925.1955.1959.1972.1991.2001.2166; has a depth of 8 and has 1 children its id is 2389
path column = 1824.1925.1955.1959.1972.1991.2001.2166.2389; has a depth of 9 and has 0 children its id is 2402
path column = 1824.1925.1983; has a depth of 3 and has 1 children its id is 2135
path column = 1824.1925.1983.2135; has a depth of 4 and has 0 children its id is 2239
path column = 1824.1926; has a depth of 2 and has 5 children its id is 1942
path column = 1824.1926; has a depth of 2 and has 11 children its id is 1928 # this is the row I am after
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1933
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1989
path column = 1824.1926.1928; has a depth of 3 and has 3 children its id is 2051
path column = 1824.1926.1928; has a depth of 3 and has 0 children its id is 2024
path column = 1824.1926.1928; has a depth of 3 and has 2 children its id is 1988

So, in this example, the row with id 1824 (the root) has 11 children, and the row with id 1928, at a depth of 2, also has 11 children; the latter is the row I am after.

I am new to ltree, and to SQL for that matter.

(This is a revised question with added sample data, posted after "Ltree find parent with most children postgresql" was closed.)


Answer 1:


Solution

To find the node with the most children:

SELECT subpath(path, -1, 1), count(*) AS children
FROM   tbl
WHERE  path <> ''
GROUP  BY 1
ORDER  BY 2 DESC
LIMIT  1;

... and exclude root nodes:

SELECT *
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path <> ''
   GROUP  BY 1
   ) ct
LEFT   JOIN (
   SELECT tbl_id
   FROM   tbl
   WHERE  path = ''
   ) x USING  (tbl_id)
WHERE  x.tbl_id IS NULL
ORDER  BY children DESC
LIMIT  1;

This assumes that root nodes have an empty ltree ('') as path. If root nodes have a NULL path instead, use path IS NULL to identify them.
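If it is unclear whether roots are stored with an empty path or with NULL, both cases can be covered in one query. The following is only a sketch under that assumption, not part of the original answer: it reuses the tbl and tbl_id names from the queries above and swaps the LEFT JOIN / IS NULL anti-join for an equivalent NOT EXISTS.

```sql
-- Sketch: exclude root nodes whether their path is '' or NULL.
SELECT ct.tbl_id, ct.children
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path <> ''   -- rows with a NULL path are dropped here as well
   GROUP  BY 1
   ) ct
WHERE  NOT EXISTS (
   SELECT 1
   FROM   tbl r
   WHERE  r.tbl_id = ct.tbl_id
   AND   (r.path IS NULL OR r.path = '')   -- r is a root node
   )
ORDER  BY ct.children DESC
LIMIT  1;
```

The filter path <> '' already skips NULL paths because the comparison does not evaluate to true for NULL, so only the root test needs the extra IS NULL branch.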

The winner in your example is actually 2001, with 5 children.

-> SQLfiddle

How?

  • Use the function subpath(...) provided by the additional module ltree.

  • Get the last node in the path with a negative offset, which is the direct parent of the element.

  • Count how often that parent appears, exclude root nodes and take the remaining one with the highest count.

  • Use ltree2text() to convert the resulting ltree label to text, so that it can be cast to integer.

  • If multiple nodes tie for the most children, the example picks an arbitrary one of them.
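The building blocks listed above can be tried in isolation. The following is a minimal sketch, assuming the ltree extension is installed and using the tbl and tbl_id names from the queries above. The second query is not from the original answer: it uses the window function rank() as one possible way to return every tied parent instead of an arbitrary one.

```sql
-- A negative offset counts from the end of the path: this yields the last label.
SELECT subpath('1824.1926.1928'::ltree, -1, 1)                  AS last_label  -- 1928 (ltree)
     , ltree2text(subpath('1824.1926.1928'::ltree, -1, 1))::int AS parent_id;  -- 1928 (integer)

-- Variant (assumption, not in the original answer): keep all parents tied for first place.
SELECT tbl_id, children
FROM  (
   SELECT ct.tbl_id, ct.children
        , rank() OVER (ORDER BY ct.children DESC) AS rnk
   FROM  (
      SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
      FROM   tbl
      WHERE  path <> ''
      GROUP  BY 1
      ) ct
   LEFT   JOIN (
      SELECT tbl_id
      FROM   tbl
      WHERE  path = ''
      ) x USING (tbl_id)
   WHERE  x.tbl_id IS NULL   -- drop root nodes before ranking
   ) ranked
WHERE  rnk = 1;
```

rank() assigns 1 to every row that shares the highest count, so the outer filter returns one row per tied parent rather than relying on LIMIT 1.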

Test case

This is the work I had to do to get to a useful test case (after trimming some noise):

See SQLfiddle.

In other words: please remember to provide a useful test case next time.
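
For readers who cannot open the fiddle, a comparable test case can be rebuilt from the sample rows in the question. This is a reconstruction, not the contents of the original fiddle: the table and column names (tbl, tbl_id, path) are taken from the queries above, while the column types and the primary key are assumptions.

```sql
-- Assumed setup; requires the ltree extension.
CREATE EXTENSION IF NOT EXISTS ltree;

CREATE TABLE tbl (
   tbl_id int PRIMARY KEY   -- assumed type
 , path   ltree             -- '' marks a root node
);

INSERT INTO tbl (tbl_id, path) VALUES
   (1824, '')
 , (1823, '')
 , (1825, '1823')
 , (1826, '1823.1825')
 , (1827, '1823.1825.1826')
 , (1828, '1823.1825.1826.1827')
 , (2001, '1824.1925.1955.1959.1972.1991')
 , (2141, '1824.1925.1955.1959.1972.1991.2001')
 , (2040, '1824.1925.1955.1959.1972.1991.2001')
 , (2054, '1824.1925.1955.1959.1972.1991.2001')
 , (2253, '1824.1925.1955.1959.1972.1991.2001')
 , (2166, '1824.1925.1955.1959.1972.1991.2001')
 , (2205, '1824.1925.1955.1959.1972.1991.2001.2054')
 , (2161, '1824.1925.1955.1959.1972.1991.2001.2141')
 , (2389, '1824.1925.1955.1959.1972.1991.2001.2166')
 , (2402, '1824.1925.1955.1959.1972.1991.2001.2166.2389')
 , (2135, '1824.1925.1983')
 , (2239, '1824.1925.1983.2135')
 , (1942, '1824.1926')
 , (1928, '1824.1926')
 , (1933, '1824.1926')
 , (1989, '1824.1926')
 , (2051, '1824.1926.1928')
 , (2024, '1824.1926.1928')
 , (1988, '1824.1926.1928');
```

With these rows, five paths end in 2001, four in 1926 and three in 1928, which is consistent with the statement above that the winner is 2001 with 5 children.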

Additional columns

Answer to comment.
First, expand the test case:

ALTER TABLE tbl ADD COLUMN postal_code text
              , ADD COLUMN whatever serial;
UPDATE tbl SET postal_code = (1230 + whatever)::text;

Have a look:

SELECT * FROM tbl;

Simply JOIN the result to the parent row in the base table:

SELECT ct.*, t.postal_code
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path <> ''
   GROUP  BY 1
   ) ct
LEFT   JOIN (
   SELECT tbl_id
   FROM   tbl
   WHERE  path = ''
   ) x USING  (tbl_id)
JOIN  tbl t USING (tbl_id)
WHERE  x.tbl_id IS NULL
ORDER  BY children DESC
LIMIT  1;


Source: https://stackoverflow.com/questions/15601829/postgresql-ltree-query-to-find-parent-with-most-children-excluding-root
