Question
I'm traversing an XML tree and I'm having trouble removing a node from the tree while keeping its inner nodes.
For example:
<xml>
  <letter name="B">
    <letter name="D">
      <letter name="E">
        <letter name="F">
          <letter name="G">
          </letter>
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
I need something like this:
<xml>
  <letter name="B">
    <letter name="D">
      <letter name="F">
        <letter name="G">
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
But I can't get this without also removing all of E's children.
Cheers!
Answer 1:
The idea is to find the letter element with name="E", get its parent, remove the element from the parent, and extend the parent with the element's children:
import xml.etree.ElementTree as etree

data = """
<xml>
  <letter name="B">
    <letter name="D">
      <letter name="E">
        <letter name="F">
          <letter name="G">
          </letter>
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
"""

XPATH = './/letter[@name="E"]'

tree = etree.fromstring(data)
letter = tree.find(XPATH)
parent = tree.find(XPATH + '/..')  # '..' selects the parent of the matched element
parent.remove(letter)  # detach E from D
parent.extend(letter)  # re-attach E's children directly under D
print(etree.tostring(tree, encoding="unicode"))
It prints:
<xml>
  <letter name="B">
    <letter name="D">
      <letter name="F">
        <letter name="G">
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
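Note that parent.extend(letter) appends the children at the end of the parent, so if the removed element has siblings, its children end up after them. Below is a minimal sketch, assuming Python 3, that splices the children in at the removed element's original position. The helper name unwrap is hypothetical (it is not part of ElementTree), and the sketch assumes the XPath matches a single element.

```python
import xml.etree.ElementTree as etree

def unwrap(root, xpath):
    """Replace the first element matching xpath with its own children, in place.

    Hypothetical helper for illustration; assumes a single match.
    Returns True if an element was unwrapped.
    """
    target = root.find(xpath)
    parent = root.find(xpath + "/..")
    if target is None or parent is None:
        return False
    index = list(parent).index(target)  # position of the element among its siblings
    children = list(target)
    parent.remove(target)
    # Insert the children where the removed element used to be.
    for offset, child in enumerate(children):
        parent.insert(index + offset, child)
    return True

doc = etree.fromstring(
    '<xml><letter name="B"><letter name="D">'
    '<letter name="E"><letter name="F"><letter name="G"/></letter></letter>'
    '<letter name="X"/>'
    '</letter></letter></xml>'
)
unwrap(doc, './/letter[@name="E"]')
d = doc.find('.//letter[@name="D"]')
names = [child.get("name") for child in d]
print(names)  # ['F', 'X']
```

With extend() instead of insert(), the same input would leave D's children in the order X, F.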
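Update (using an iterative approach):
def iterparent(tree):
    # Yield (parent, child) pairs for every element in the tree.
    for parent in tree.iter():  # getiterator() was removed in Python 3.9
        for child in list(parent):  # iterate over a copy; the caller modifies parent
            yield parent, child

tree = etree.fromstring(data)
for parent, child in iterparent(tree):
    if child.tag == "letter" and child.attrib.get('name') == "E":
        parent.remove(child)
        parent.extend(child)
print(etree.tostring(tree, encoding="unicode"))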
The iterparent() function is taken from the "Accessing Parents" paragraph of the docs.
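Elements in xml.etree.ElementTree do not keep a reference to their parent, which is why a helper such as the generator above is needed. An alternative sketch, assuming Python 3, builds a child-to-parent dictionary and uses it to unwrap every matching element. The function name unwrap_all is hypothetical.

```python
import xml.etree.ElementTree as etree

def unwrap_all(root, name):
    """Remove every <letter name=...> element but keep its children (hypothetical helper)."""
    # Snapshot the matches first so the tree is not modified while it is being walked.
    targets = [el for el in root.iter("letter") if el.get("name") == name]
    for target in targets:
        # Rebuild the map each time: earlier unwraps change who the parents are.
        parent_map = {child: parent for parent in root.iter() for child in parent}
        parent = parent_map.get(target)
        if parent is None:
            continue  # the root itself has no parent and cannot be unwrapped
        index = list(parent).index(target)
        parent.remove(target)
        for offset, child in enumerate(list(target)):
            parent.insert(index + offset, child)
    return root

doc = etree.fromstring(
    '<xml><letter name="B"><letter name="D"><letter name="E">'
    '<letter name="F"><letter name="G"/></letter>'
    '</letter></letter></letter></xml>'
)
unwrap_all(doc, "E")
remaining = [el.get("name") for el in doc.iter("letter")]
print(remaining)  # ['B', 'D', 'F', 'G']
```

Rebuilding the map inside the loop is wasteful on large trees but keeps the sketch correct when one matching element is nested inside another.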
Answer 2:
Another thing: is it possible to do something like this?
Initial XML:
<xml>
  <letter name="B">
    <letter name="D">
      <letter name="E">
        <letter name="F">
          <letter name="G">
          </letter>
        </letter>
      </letter>
      <letter name="H">
        <letter name="I">
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
Then have as the output a list with two trees, something like this:
<xml>
  <letter name="B">
    <letter name="E">
      <letter name="F">
        <letter name="G">
        </letter>
      </letter>
    </letter>
  </letter>
</xml>
<xml>
  <letter name="B">
    <letter name="H">
      <letter name="I">
      </letter>
    </letter>
  </letter>
</xml>
As you can see, @falsetru and @alecxe, I just deleted D and left only one child per tree.
Thanks!
Answer 3:
I just finished it. I only needed to copy the tree before the deletion; otherwise the original object would be modified.
Here is the solution. By the way, thanks a lot!
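import copy

def remove_letter(tree_original, letter):
    # Work on a deep copy so the original tree is left untouched.
    tree = copy.deepcopy(tree_original)
    for parent in tree.iter():
        for child in list(parent):  # iterate over a copy; parent is modified below
            if child.attrib.get('name') == letter:
                parent.remove(child)
                parent.extend(child)
    return tree  # return the whole copied tree, not just the modified parent

def get_next_trees(tree):
    my_trees = []
    for parent in tree.iter():
        for child in parent:
            if child.attrib.get('name') == "D":
                # The letter argument was missing in the original call; "D" is assumed here.
                my_trees.append(remove_letter(tree, "D"))
    return my_trees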
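For the output described in the follow-up (one tree per child of D, with D itself removed), a minimal sketch under Python 3 might look like the following. The function name split_by_children is hypothetical, and the sketch assumes exactly one element carries the given name.

```python
import copy
import xml.etree.ElementTree as etree

def split_by_children(root, name):
    """Return one deep-copied tree per child of the element called `name`.

    In each copy the named element is replaced by exactly one of its children.
    Hypothetical helper for illustration.
    """
    xpath = './/letter[@name="%s"]' % name
    target = root.find(xpath)
    if target is None:
        return []
    trees = []
    for i in range(len(target)):
        clone = copy.deepcopy(root)  # leave the original tree untouched
        node = clone.find(xpath)
        parent = clone.find(xpath + "/..")
        index = list(parent).index(node)
        parent.remove(node)
        parent.insert(index, node[i])  # keep only the i-th child
        trees.append(clone)
    return trees

data = """<xml><letter name="B"><letter name="D">
<letter name="E"><letter name="F"><letter name="G"/></letter></letter>
<letter name="H"><letter name="I"/></letter>
</letter></letter></xml>"""

original = etree.fromstring(data)
trees = split_by_children(original, "D")
chains = [[el.get("name") for el in t.iter("letter")] for t in trees]
print(chains)  # [['B', 'E', 'F', 'G'], ['B', 'H', 'I']]
```

Each result is an independent copy, so the input tree still contains D afterwards.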
Source: https://stackoverflow.com/questions/23498394/remove-a-node-from-etree-but-leaving-child