Note: with help from RhodiumToad on #postgresql, I've arrived at a solution, which I posted as an answer. If anyone can improve on this, please chime in!
I have not be
An alternative approach would be to traverse the graph in reverse order:
WITH RECURSIVE cte AS (
   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path
   FROM   node_relations r
   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id
   WHERE  r0.ancestor_node_id IS NULL  -- start at the end

   UNION ALL
   SELECT r.ancestor_node_id || c.path
   FROM   cte c
   JOIN   node_relations r ON r.descendant_node_id = c.path[1]
   )
SELECT path
FROM   cte
ORDER  BY path;
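This produces the subset of paths that end at a leaf node, including every path from each root node to its ultimate descendants. For deep trees that also spread out a lot, this would entail far fewer join operations. To add every sub-path as well, you could append a LATERAL join to the outer SELECT. Prefixes shared by several paths are then returned once per path, so use SELECT DISTINCT if you need each path only once: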
WITH RECURSIVE cte AS (
   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path
   FROM   node_relations r
   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id
   WHERE  r0.ancestor_node_id IS NULL  -- start at the end

   UNION ALL
   SELECT r.ancestor_node_id || c.path
   FROM   cte c
   JOIN   node_relations r ON r.descendant_node_id = c.path[1]
   )
SELECT l.path
FROM   cte, LATERAL (
   SELECT path[1:g] AS path
   FROM   generate_series(2, array_length(path,1)) g
   ) l
ORDER  BY l.path;
I ran a quick test, and it was not faster than RhodiumToad's solution. It might still be faster for big or wide tables, so try it with your data.
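For a quick sanity check before running this against real data, here is a minimal sketch of a test table. The column types, the primary key and the sample rows are my assumptions, not taken from the question; only the table and column names come from the queries above.

```sql
-- Hypothetical test table: column types and the key are assumptions.
CREATE TEMP TABLE node_relations (
   ancestor_node_id   int NOT NULL
 , descendant_node_id int NOT NULL
 , PRIMARY KEY (ancestor_node_id, descendant_node_id)
);

-- Small tree: 1 -> 2, 2 -> 3, 2 -> 4
INSERT INTO node_relations (ancestor_node_id, descendant_node_id)
VALUES (1, 2), (2, 3), (2, 4);

-- Rows the recursion starts from: relations whose descendant has no children.
SELECT r.ancestor_node_id, r.descendant_node_id
FROM   node_relations r
LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id
WHERE  r0.ancestor_node_id IS NULL;
```

With these three rows, the starting set is (2,3) and (2,4). The first query should then return {1,2,3}, {1,2,4}, {2,3} and {2,4}. The LATERAL variant returns the same four paths plus {1,2}, which shows up twice because it is a prefix of both full paths.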