Return result from arbitrarily nested xml tree sum

Question

I have the following code that recurses over an XML tree representing a simple arithmetic expression:

root = etree.XML(request.data['expression'])

def addleafnodes(root):
    numbers = []
    for child in root:
        if root.tag != "root" and root.tag != "expression":
            print(root.tag, child.text)

            if child.tag != "add" and child.tag != "multiply":
                numbers.append(int(child.text))
                print("NUMBERS", numbers)
            elif child.tag == "add":
                numbers.append(np.sum(addleafnodes(child)))
                print("NUMBERS", numbers)
            elif child.tag == "multiply":
                numbers.append(np.prod(addleafnodes(child)))
                print("NUMBERS", numbers)
        print("NUMBERS", numbers)
        addleafnodes(child)
    return numbers

newresults = addleafnodes(root)
print("[NEW RESULTS]", newresults)

The xml is:

<root>
    <expression>
        <add>
            <add>
                <number>1</number>
                <number>2</number>
            </add>
            <multiply>
                <number>2</number>
                <number>3</number>
            </multiply>
            <add>
                <number>4</number>
                <number>5</number>
            </add>
            <number>3</number>
            <multiply>
                <number>1</number>
                <add>
                    <number>3</number>
                    <number>4</number>
                </add>
            </multiply>
        </add>
    </expression>
</root>

The code seems to work right up until the last loop, when it resets the numbers list and seems to start the process again, abortively.

How do I tell Python (lxml) to stop once it has looked at every node? I've probably missed something important!

Answer 1:

First of all, you can make this easier on yourself by testing what a tag is rather than what it is not (that is, replace the != comparisons with == where you can).

One problem was the line addleafnodes(child): it returned a list that was then thrown away. Since each call returns a list of numbers that still need to be added, multiplied, and so on, you can merge them into the numbers list with numbers.extend(somelist). Recursion is a bit hard to explain, so the code below may make more sense than the description. Something I sometimes do is add a depth parameter to the function and increment it every time I recurse; that way, when printing information, it is easier to see which level a number is returned from and where it goes.

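import numpy as np

def addleafnodes(root):
    numbers = []
    for child in root:
        if child.tag == "number":
            # Leaf node: take its value directly.
            numbers.append(int(child.text))
        elif child.tag == "add":
            # Reduce the child's operands to a single sum.
            numbers.append(np.sum(addleafnodes(child)))
        elif child.tag == "multiply":
            # Reduce the child's operands to a single product.
            numbers.append(np.prod(addleafnodes(child)))
        else:
            # Wrapper tags (root, expression): pass the child's values up unchanged.
            numbers.extend(addleafnodes(child))
        print("NUMBERS: ", numbers)
    return numbers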

newresults = addleafnodes(root)
print("[NEW RESULTS]", newresults)

# outputs:
NUMBERS:  [1]
NUMBERS:  [1, 2]
NUMBERS:  [3]
NUMBERS:  [2]
NUMBERS:  [2, 3]
NUMBERS:  [3, 6]
NUMBERS:  [4]
NUMBERS:  [4, 5]
NUMBERS:  [3, 6, 9]
NUMBERS:  [3, 6, 9, 3]
NUMBERS:  [1]
NUMBERS:  [3]
NUMBERS:  [3, 4]
NUMBERS:  [1, 7]
NUMBERS:  [3, 6, 9, 3, 7]
NUMBERS:  [28]
NUMBERS:  [28]
[NEW RESULTS] [28]
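
The depth-tracking idea mentioned above could look roughly like the sketch below. It is only an illustration: the depth parameter and the indented print format are my own additions, and it uses the standard-library xml.etree.ElementTree, sum and math.prod in place of lxml and numpy so that it runs without extra dependencies (iterating over an element's children works the same way in both libraries). The XML is a shortened version of the one in the question.

```python
import math
import xml.etree.ElementTree as ET

def addleafnodes(root, depth=0):
    """Same traversal as the answer's function; depth is used only for tracing."""
    indent = "  " * depth
    numbers = []
    for child in root:
        if child.tag == "number":
            numbers.append(int(child.text))
        elif child.tag == "add":
            numbers.append(sum(addleafnodes(child, depth + 1)))
        elif child.tag == "multiply":
            numbers.append(math.prod(addleafnodes(child, depth + 1)))
        else:
            # Wrapper tags such as <expression> pass their values up unchanged.
            numbers.extend(addleafnodes(child, depth + 1))
        # Indentation shows which recursion level produced this list.
        print(f"{indent}depth={depth} <{root.tag}> NUMBERS: {numbers}")
    return numbers

xml_text = """
<root>
    <expression>
        <add>
            <add><number>1</number><number>2</number></add>
            <multiply><number>2</number><number>3</number></multiply>
            <number>3</number>
        </add>
    </expression>
</root>
""".strip()

result = addleafnodes(ET.fromstring(xml_text))
print("[NEW RESULTS]", result)
```

Deeper levels print with more indentation, so it is easy to see a nested list being reduced to one number before it is appended to its parent's list.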

Another thing: you've chosen to allow a whole list of numbers inside an <add></add>. You could instead allow exactly two operands, since addition is a binary operation, and rely on nesting for longer sums. The same applies to any other operator with a fixed number of operands (unary, binary, ternary, ...).

<add>
    <number>1</number>
    <add>
        <number>2</number>
        <number>3</number>
    </add>
</add>

That way, maybe you can eliminate the for-loop, but I'm not sure if it creates other problems. :-)
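
A minimal sketch of that loop-free variant, assuming every <add> and <multiply> has exactly two children. The function name evaluate is hypothetical, and the snippet uses the standard-library xml.etree.ElementTree rather than lxml so that it is self-contained. Wrapper tags such as <root> and <expression> are deliberately not handled here.

```python
import xml.etree.ElementTree as ET

def evaluate(node):
    """Evaluate a tree in which every operator has exactly two operands."""
    if node.tag == "number":
        return int(node.text)
    # Unpacking raises ValueError unless the node has exactly two children.
    left, right = node
    if node.tag == "add":
        return evaluate(left) + evaluate(right)
    if node.tag == "multiply":
        return evaluate(left) * evaluate(right)
    raise ValueError(f"unexpected tag: {node.tag}")

tree = ET.fromstring(
    "<add><number>1</number>"
    "<add><number>2</number><number>3</number></add></add>"
)
result = evaluate(tree)
print(result)  # 6
```

The function returns a single number instead of a list, so there is nothing to collect and no loop. The trade-off is that the XML gets more deeply nested for long sums or products.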

Source: https://stackoverflow.com/questions/56446862/return-result-from-arbitrarily-nested-xml-tree-sum

Tags

python-3.x

recursion

lxml