Parse indented text tree in Java

旧时难觅i 2020-12-19 14:38

I have an indented file that I need to parse using Java. I need some way to place its contents in a Section class, as shown below.

    root
     root1
       text1
           


        
4 answers
  •  南笙
    南笙 (original poster)
    2020-12-19 14:59

    I implemented an alternative solution using recursive function calls. As far as I can estimate, it will perform worse than Max Seo's suggestion, especially on deep hierarchies. However, it is easier to understand (in my opinion), and hence easier to modify to your specific needs. Have a look and let me know if you have any suggestions.

    One benefit is that it can handle trees with multiple roots, as it is.

    Problem description - just to be clear about it...

    Assume we have a construct Node, which holds data and has zero or more children that are also Nodes. From a text input, we want to build trees of nodes in which each node's data is the content of one line and its position in the tree follows from the line's position and indentation: an indented line is a child of the nearest preceding line that is less indented.
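Since the question asks about Java, here is a minimal sketch of the two data shapes the description implies. The names `Node`, `Line`, `parse` and `parseAll` are hypothetical (none of them appear in the original post), and the sketch assumes that each leading space or tab counts as one unit of indentation and that blank lines are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node: holds the content of one line plus its child nodes.
final class Node {
    final String data;
    final List<Node> children = new ArrayList<>();

    Node(String data) {
        this.data = data;
    }
}

// Hypothetical representation of one input line before it is placed in a tree.
final class Line {
    final int indent;   // number of leading whitespace characters
    final String data;  // line content without surrounding whitespace

    Line(int indent, String data) {
        this.indent = indent;
        this.data = data;
    }

    // Assumption: every leading space or tab counts as one indent unit.
    static Line parse(String raw) {
        int indent = 0;
        while (indent < raw.length()
                && (raw.charAt(indent) == ' ' || raw.charAt(indent) == '\t')) {
            indent++;
        }
        return new Line(indent, raw.trim());
    }

    // Splits the text on line breaks and skips lines that are blank.
    static List<Line> parseAll(String text) {
        List<Line> lines = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            if (!raw.trim().isEmpty()) {
                lines.add(parse(raw));
            }
        }
        return lines;
    }
}
```

Only the relative order of indents matters for the tree, so the exact unit (spaces, tabs, or a fixed width) can be changed in `parse` without touching anything else.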

    Algorithm description

    Assuming we have a list of lines, define a function which:

    • If the input list has at least two lines:
      • Removes the first line from the list
      • Removes all lines from the list which satisfy all:
        • Has a higher indent than the first line
        • Occurs before the next line with indent less than or equal to the first line
      • Passes these lines recursively to the function, and sets the result as children of the first line
      • If it has remaining lines, passes these recursively to the function, and combines them with the first line, as the result
      • If there are no more remaining lines, returns a list with the first line as the single element
    • If the input list has one line:
      • Sets the children of that one line to an empty list
      • Returns the list
    • If the input list has no elements
      • Returns an empty list
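The steps above could be sketched in Java roughly as follows. This is an illustration under assumptions, not the poster's code: `Line`, `Node`, `IndentTree` and `linesToTree` are hypothetical names, and the sketch uses `subList` views instead of removing elements from the input. It also folds the three cases into one, because a single line is just the general case with no deeper-indented followers, and an empty list yields an empty result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for this sketch; the original post does not define them in Java.
final class Line {
    final int indent;
    final String data;
    Line(int indent, String data) { this.indent = indent; this.data = data; }
}

final class Node {
    final String data;
    final List<Node> children;
    Node(String data, List<Node> children) { this.data = data; this.children = children; }
}

final class IndentTree {
    // Builds a list of sibling trees; deeper-indented lines become children.
    static List<Node> linesToTree(List<Line> lines) {
        List<Node> result = new ArrayList<>();
        if (lines.isEmpty()) {
            return result;
        }
        Line first = lines.get(0);

        // Find the run of lines after the first one that are indented deeper than it.
        int end = 1;
        while (end < lines.size() && lines.get(end).indent > first.indent) {
            end++;
        }

        // That run forms the subtree of the first line.
        result.add(new Node(first.data, linesToTree(lines.subList(1, end))));

        // The remaining lines start at the next sibling (or at another root).
        result.addAll(linesToTree(lines.subList(end, lines.size())));
        return result;
    }
}
```

As in the description, the siblings are handled by a recursive call too, so the recursion depth grows with the number of siblings as well as with the nesting depth.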

    Calling the function with a list of lines results in a list of trees, based on their indentation. If the input has only one root, the resulting tree is the first element of the result list.

    Pseudocode

    List LinesToTree( List lines )
    {
        if(lines.count >= 2)
        {
            firstLine = lines.shift
            nextLine = lines[0]
            children = new List
    
            while(nextLine != null && firstLine.indent < nextLine.indent)
            {
                children.add(lines.shift)
                nextLine = lines[0]
            }
    
            firstLineNode = new Node
            firstLineNode.data = firstLine.data
            firstLineNode.children = LinesToTree(children)
    
            resultNodes = new List
            resultNodes.add(firstLineNode)
    
            if(lines.count > 0)
            {
                siblingNodes = LinesToTree(lines)
                resultNodes.addAll(siblingNodes)
                return resultNodes
            }
            else
            {
                return resultNodes
            }
        }
        elseif(lines.count == 1)
        {
            nodes = new List
            node = new Node
            node.data = lines[0].data
            node.children = new List
            nodes.add(node)
            return nodes
        }
        else
        {
            return new List
        }
    }
    

    PHP implementation using arrays

    The implementation can be customised in two ways: through a callback that returns the indent of a line, and through the name of the children field in the output array.

    public static function IndentedLinesToTreeArray(array $lineArrays, ?callable $getIndent = null, $childrenFieldName = "children")
    {
        // Default function to get element indentation
        if($getIndent === null){
            $getIndent = function($line){
                return $line["indent"];
            };
        }
    
        $lineCount = count($lineArrays);
    
        if($lineCount >= 2)
        {
            $firstLine = array_shift($lineArrays);
            $children = [];
            $nextLine = $lineArrays[0];
    
            while($getIndent($firstLine) < $getIndent($nextLine)){
                $children[] = array_shift($lineArrays);
                if(!isset($lineArrays[0])){
                    break;
                }
                $nextLine = $lineArrays[0];
            }
    
            $firstLine[$childrenFieldName] = self::IndentedLinesToTreeArray($children, $getIndent, $childrenFieldName);
    
            if(count($lineArrays)){
                return array_merge([$firstLine],self::IndentedLinesToTreeArray($lineArrays, $getIndent, $childrenFieldName));
            }else{
                return [$firstLine];
            }
        }
        elseif($lineCount == 1)
        {
            $lineArrays[0][$childrenFieldName] = [];
            return $lineArrays;
        }
        else
        {
            return [];
        }
    }
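For readers who want the same customisation point in Java, a possible analogue is sketched below. It is not a translation of the PHP code: `Tree`, `GenericIndentTree` and `toTrees` are hypothetical names, the indent callback is modelled as a `java.util.function.ToIntFunction`, and the children field name is fixed because Java fields are declared statically. The sketch also swaps the sibling recursion for a loop, so recursion depth grows only with nesting depth.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical generic tree node used only in this sketch.
final class Tree<T> {
    final T value;
    final List<Tree<T>> children;
    Tree(T value, List<Tree<T>> children) { this.value = value; this.children = children; }
}

final class GenericIndentTree {
    // getIndent plays the role of the $getIndent callback in the PHP version.
    static <T> List<Tree<T>> toTrees(List<T> items, ToIntFunction<T> getIndent) {
        List<Tree<T>> result = new ArrayList<>();
        int i = 0;
        while (i < items.size()) {
            T first = items.get(i);
            int firstIndent = getIndent.applyAsInt(first);

            // Skip over every following item that is indented deeper than 'first'.
            int end = i + 1;
            while (end < items.size() && getIndent.applyAsInt(items.get(end)) > firstIndent) {
                end++;
            }

            // Recurse only for the children; siblings are handled by the outer loop.
            result.add(new Tree<>(first, toTrees(items.subList(i + 1, end), getIndent)));
            i = end;
        }
        return result;
    }
}
```

With this shape the caller decides what an indent is, for example by passing a lambda that counts the leading spaces of a raw `String`.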
    
