Why is lxml.etree.iterparse() eating up all my memory?

孤街浪徒 2020-12-01 06:50

This eventually consumes all my available memory and then the process is killed. I've tried changing the tag from schedule to 'smaller' tags but that didn'

3 Answers
  •  渐次进展
    2020-12-01 07:22

    This worked really well for me:

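    def destroy_tree(tree):
        """Detach every element of an lxml ElementTree from its parent, deepest first."""
        root = tree.getroot()

        # Map each node to [depth, parent]; the root has depth 0 and no parent.
        node_tracker = {root: [0, None]}

        # iterdescendants() walks in document order, so a parent is always
        # recorded before any of its children.
        for node in root.iterdescendants():
            parent = node.getparent()
            node_tracker[node] = [node_tracker[parent][0] + 1, parent]

        # Order the nodes from deepest to shallowest so leaves are removed first.
        node_tracker = sorted([(depth, parent, child) for child, (depth, parent)
                               in node_tracker.items()], key=lambda x: x[0], reverse=True)

        for _, parent, child in node_tracker:
            if parent is None:
                # Only the root has no parent, and it sorts last.
                break
            parent.remove(child)

        # This drops only the function's local reference; the caller still has
        # to release its own reference to the tree.
        del tree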
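
For the parsing loop itself, a commonly used pattern is to clear each element once it has been handled and to delete the siblings that precede it, since iterparse() otherwise keeps building the full tree behind the scenes. The following is only a minimal sketch, assuming the `schedule` elements sit directly under the root; `iter_elements` is a hypothetical helper name, not part of lxml:

```python
import io

from lxml import etree


def iter_elements(source, tag):
    """Yield each completed <tag> element, then free it and its earlier siblings.

    Hypothetical helper for illustration; assumes the matching elements are
    siblings under one parent (for example, direct children of the root).
    """
    for _event, elem in etree.iterparse(source, events=("end",), tag=tag):
        # Hand the fully parsed element to the caller first.
        yield elem
        # Then drop its children, text and attributes.
        elem.clear()
        # Earlier siblings are still attached to the parent; delete them too.
        while elem.getprevious() is not None:
            del elem.getparent()[0]


if __name__ == "__main__":
    # Small in-memory document standing in for the real file.
    xml = b"<root><schedule id='1'><item>x</item></schedule><schedule id='2'/></root>"
    for schedule in iter_elements(io.BytesIO(xml), "schedule"):
        print(schedule.get("id"))
```

Anything needed from an element has to be read inside the loop body, because the element is emptied as soon as the loop moves on to the next one.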
    
