Postgres - Convert adjacency list to nested JSON object

栀梦 2020-12-19 18:06

I have a table with the following data in Postgres, and I am having a hard time converting it into a nested JSON object.

node_id    parent_node    name
-------    -----------    ----

1 Answer
  • 2020-12-19 18:53

    Using WITH RECURSIVE (https://www.postgresql.org/docs/current/static/queries-with.html) and the JSON functions (https://www.postgresql.org/docs/current/static/functions-json.html), I built this solution:

    db<>fiddle
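
    The fiddle's setup is not reproduced here. A minimal sketch of a matching test table follows. The column names come from the question and the table name test from the query below; the column types are a guess, the rows are inferred from the result shown further down (node ids 2 to 8 and their paths), and node_id 1 for the root is an assumption:

    ```sql
    -- Assumed schema: names are taken from the question, the types are a guess.
    CREATE TABLE test (
        node_id     int PRIMARY KEY,
        parent_node int REFERENCES test (node_id),  -- NULL marks the root
        name        text NOT NULL
    );

    -- Rows inferred from the answer's output; the root's node_id = 1 is assumed.
    INSERT INTO test (node_id, parent_node, name) VALUES
        (1, NULL, 'node1'),
        (2, 1,    'node2'),
        (3, 1,    'node3'),
        (4, 2,    'node4'),
        (5, 2,    'node5'),
        (6, 2,    'node6'),
        (7, 3,    'node7'),
        (8, 3,    'node8');
    ```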

    The core functionality:

        WITH RECURSIVE tree(node_id, ancestor, child, path, json) AS  (
          SELECT 
              t1.node_id, 
              NULL::int, 
              t2.node_id,
              '{children}'::text[] || 
                 (row_number() OVER (PARTITION BY t1.node_id ORDER BY t2.node_id) - 1)::text,-- C
              jsonb_build_object('name', t2.name, 'children', array_to_json(ARRAY[]::int[])) -- B
          FROM test t1
          LEFT JOIN test t2 ON t1.node_id = t2.parent_node                                   -- A
          WHERE t1.parent_node IS NULL
    
          UNION
    
          SELECT
              t1.node_id, 
              t1.parent_node, 
              t2.node_id,
              tree.path || '{children}' || (row_number() OVER (PARTITION BY t1.node_id ORDER BY t2.node_id) - 1)::text, 
              jsonb_build_object('name', t2.name, 'children', array_to_json(ARRAY[]::int[]))
          FROM test t1
          LEFT JOIN test t2 ON t1.node_id = t2.parent_node
          INNER JOIN tree ON (t1.node_id = tree.child)
          WHERE t1.parent_node = tree.node_id                                                -- D
        )
        SELECT                                                                               -- E
            child as node_id, path, json 
        FROM tree 
        WHERE child IS NOT NULL ORDER BY path
    

    Every WITH RECURSIVE query consists of an initial SELECT (the non-recursive term) and a recursive part (the second SELECT), combined by a UNION.

    A: Joining the table against itself to find the children of a node_id.

    B: Building the JSON object for the child, which can later be inserted into its parent.

    C: Building the path (from the root) at which the child object has to be inserted. The window function row_number() (https://www.postgresql.org/docs/current/static/tutorial-window.html) generates the index of the child within the children array of its parent.

    D: The recursive part works like the initial part with one difference: it does not search for the root element but for the elements whose parent is a node found in the previous recursion step.

    E: Executing the recursion and filtering out the rows that have no child (child IS NULL) gives this result:

    node_id   path                      json
    2         children,0                {"name": "node2", "children": []}
    4         children,0,children,0     {"name": "node4", "children": []}
    5         children,0,children,1     {"name": "node5", "children": []}
    6         children,0,children,2     {"name": "node6", "children": []}
    3         children,1                {"name": "node3", "children": []}
    7         children,1,children,0     {"name": "node7", "children": []}
    8         children,1,children,1     {"name": "node8", "children": []}
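
    To see step C in isolation, the index calculation can be run without the recursion. This is only a sketch, assuming a test table that holds the sample hierarchy; the aliases child_index and relative_path are made up for illustration:

    ```sql
    -- Number the children of each parent, starting at 0, and build the
    -- path segment that points at that position in the parent's array.
    SELECT
        parent_node,
        node_id,
        row_number() OVER (PARTITION BY parent_node ORDER BY node_id) - 1 AS child_index,
        '{children}'::text[]
            || (row_number() OVER (PARTITION BY parent_node ORDER BY node_id) - 1)::text
            AS relative_path
    FROM test
    WHERE parent_node IS NOT NULL      -- the root has no position in any array
    ORDER BY parent_node, node_id;
    ```

    The recursion prepends the parent's own path to this segment, which is how node4, node5 and node6 end up at children,0,children,0 through children,0,children,2 in the result above.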
    

    Because I found no way to add all child elements within the recursion (the original JSON is not a global variable, so each row only knows the changes made by its direct ancestors, not those made by their siblings), I had to iterate over the rows in a second step.

    That's why I built the function. Inside it I can iterate over the rows and keep updating a single variable. With the function jsonb_insert I insert every calculated element into a root JSON object, using the calculated path.
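
    A single insertion step can be tried on its own. The sketch below uses literal values in place of the loop variables and inserts the first child of the root at the path children,0:

    ```sql
    SELECT jsonb_insert(
        '{"name": "node1", "children": []}'::jsonb,   -- root object built so far
        '{children,0}',                               -- path from the recursive query
        '{"name": "node2", "children": []}'::jsonb    -- element to insert
    );
    -- {"name": "node1", "children": [{"name": "node2", "children": []}]}
    ```

    Because each call returns a new value instead of modifying its input, the result has to be assigned back to the variable on every iteration, which is what the loop in the function does.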

    CREATE OR REPLACE FUNCTION json_tree() RETURNS jsonb AS $$
    DECLARE
        _json_output jsonb;
        _temprow record;
    BEGIN
        SELECT 
            jsonb_build_object('name', name, 'children', array_to_json(ARRAY[]::int[])) 
        INTO _json_output 
        FROM test 
        WHERE parent_node IS NULL;
    
        FOR _temprow IN
            /* the WITH RECURSIVE query above, including ORDER BY path, so that parents are inserted before their children */
        LOOP
            SELECT jsonb_insert(_json_output, _temprow.path, _temprow.json) INTO _json_output;
        END LOOP;
    
        RETURN _json_output;
    END;
    $$ LANGUAGE plpgsql;
    

    The last step is calling the function and making the JSON more readable with jsonb_pretty():

    {
        "name": "node1",
        "children": [{
            "name": "node2",
            "children": [{
                "name": "node4",
                "children": []
            },
            {
                "name": "node5",
                "children": []
            },
            {
                "name": "node6",
                "children": []
            }]
        },
        {
            "name": "node3",
            "children": [{
                "name": "node7",
                "children": []
            },
            {
                "name": "node8",
                "children": []
            }]
        }]
    }
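
    The output above would come from a call along these lines, assuming the function was created under the name json_tree() as shown earlier:

    ```sql
    -- json_tree() returns jsonb; jsonb_pretty() only changes the text layout.
    SELECT jsonb_pretty(json_tree());
    ```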
    

    I am sure it is possible to optimize the query, but as a sketch it works.
