If I have an xml file like this:
<root>
<item>
<prop>something</prop>
</item>
<test>
<prop>something</prop>
</test>
<test2>
<prop>something</prop>
</test2>
</root>
I can use
xmlTree.getroot().findall("item")
to get all of the 'item' elements.
How would I get all of the 'item' OR 'test' elements? I want something like:
xmlTree.getroot().findall("item or test")
I didn't see anything like this in the examples in the documentation. Any ideas?
Since ElementTree from the standard library provides only limited XPath support, you can use the XPath union operator (|) only if you are using lxml:
from lxml import etree as ET
data = """<?xml version="1.0"?>
<data>
<item>1</item>
<test>2</test>
</data>"""
tree = ET.fromstring(data)
for element in tree.xpath('//item|//test'):
    print(element.text)
prints:
1
2
With xml.etree.ElementTree, you can combine the results of two separate findall() calls:
for element in tree.findall('.//item') + tree.findall('.//test'):
    print(element.text)
Or, check the tag name inside the loop:
for element in tree.iter():
    if element.tag in ('item', 'test'):
        print(element.text)
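The tag check inside the loop can be wrapped in a small helper. This is only a sketch: find_any is a hypothetical name chosen for illustration, not part of ElementTree. One difference worth noting is that iter() yields elements in document order, whereas concatenating two findall() results groups them by tag.

```python
import xml.etree.ElementTree as ET

def find_any(root, tags):
    """Return every element under root (root included) whose tag is in tags.

    find_any is a hypothetical helper name, not an ElementTree API.
    """
    wanted = set(tags)
    # iter() walks the tree depth-first, starting with root itself
    return [el for el in root.iter() if el.tag in wanted]

data = """<data>
<test>2</test>
<item>1</item>
<test2>3</test2>
</data>"""

root = ET.fromstring(data)
matches = find_any(root, ["item", "test"])
print([el.text for el in matches])  # ['2', '1'] (document order)
```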
A "wildcard" solution for large data sets
Here is a solution that does not require listing the tags as "A | B | ...". Instead, use "*" as a wildcard to select all direct children of the root, then filter out the unwanted ones by index, as in the code below. In this question, for example, the last tag, "test2", can be excluded with lst[:-1].
import xml.etree.ElementTree as ET
data='''
<root>
<item>
<prop>something1</prop>
</item>
<test>
<prop>something2</prop>
</test>
<test2>
<prop>something3</prop>
</test2>
</root>'''
root = ET.fromstring(data)
lst = root.findall('*')
for x in lst[:-1]:
    print(x.find('prop').text)
OUTPUT:
something1
something2
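Slicing by index depends on the order of the children in the file. As an alternative sketch, the same wildcard result can be filtered by tag name instead of position; the excluded set here is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

data = '''
<root>
<item>
<prop>something1</prop>
</item>
<test>
<prop>something2</prop>
</test>
<test2>
<prop>something3</prop>
</test2>
</root>'''

root = ET.fromstring(data)
excluded = {'test2'}  # tags to skip; chosen for this example
# findall('*') returns only the direct children of root
kept = [child for child in root.findall('*') if child.tag not in excluded]
texts = [child.find('prop').text for child in kept]
print(texts)  # ['something1', 'something2']
```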
Source: https://stackoverflow.com/questions/22560862/elementtree-findall-or-operator