Python minidom: How to access an element

Submitted by 与世无争的帅哥 on 2019-12-25 07:10:03

Question


I'm working on parsing an XML file in Python. The XML has a structure like this:

<layer1>
    <layer2>
        <element>
            <info1></info1>
        </element>
        <element>
            <info1></info1>
        </element>
        <element>
            <info1></info1>
        </element>
    </layer2>
</layer1>

Without layer2, I have no problem accessing the data in info1. But with layer2, I'm really in trouble. Without it, I can address info1 with: root.firstChild.childNodes[0].childNodes[0].data

So my thought was that I could do something similar, like this: root.firstChild.firstChild.childNodes[0].childNodes[0].data

########## Solution

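So this is how I solved my problem:

# cElementTree was removed in Python 3.9; ElementTree offers the same API.
from xml.etree import ElementTree as ET

tree = ET.parse("test.xml")
root = tree.getroot()

# Remove every <element> under <layer2> whose <info1> text is not "abc".
# A path with a trailing slash such as 'layer2/' is read as 'layer2/*',
# which selects one level too deep for the structure shown above.
for elem in root.findall('layer2'):
    for node in elem.findall('element'):
        x = node.find('info1').text
        if x != "abc":
            elem.remove(node)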
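
The same filtering can be tried without a file on disk. This is only a minimal sketch: it assumes the structure from the question, the info1 values ("abc", "xyz") are invented for illustration, and the helper name keep_matching is hypothetical.

```python
from xml.etree import ElementTree as ET

# Sample document; the info1 values are invented for illustration.
SAMPLE = """
<layer1>
    <layer2>
        <element><info1>abc</info1></element>
        <element><info1>xyz</info1></element>
        <element><info1>abc</info1></element>
    </layer2>
</layer1>
"""


def keep_matching(root, wanted):
    """Remove every <element> under <layer2> whose <info1> text differs from wanted."""
    for layer in root.findall('layer2'):
        # findall() returns a list, so removing children while looping is safe.
        for node in layer.findall('element'):
            if node.findtext('info1') != wanted:
                layer.remove(node)
    return root


root = keep_matching(ET.fromstring(SAMPLE), "abc")
remaining = [info.text for info in root.iter('info1')]
print(remaining)
```

Using findtext() instead of find(...).text also avoids an AttributeError when an element has no info1 child at all.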

Answer 1:


Don't use the minidom API if you can help it. Use the ElementTree API instead; the xml.dom.minidom documentation explicitly states that:

Users who are not already proficient with the DOM should consider using the xml.etree.ElementTree module for their XML processing instead.

Here is a short sample that uses the ElementTree API to access your elements:

from xml.etree import ElementTree as ET

tree = ET.parse('inputfile.xml')

for info in tree.findall('.//element/info1'):
    print(info.text)

This uses an XPath expression to list all info1 elements that are contained inside an element element, regardless of their position in the overall XML document.

If all you need is the first info1 element, use .find():

print(tree.find('.//info1').text)
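
To see how the two lookups behave, here is a small sketch that parses the question's structure from a string instead of reading inputfile.xml. The info1 values and the info2 tag name are assumptions added for illustration. Note that .find() returns None when nothing matches, so checking the result before reading .text avoids an AttributeError.

```python
from xml.etree import ElementTree as ET

# Same shape as the XML in the question; the text values are made up.
doc = ET.fromstring(
    "<layer1><layer2>"
    "<element><info1>first</info1></element>"
    "<element><info1>second</info1></element>"
    "</layer2></layer1>"
)

# findall() with './/' searches all descendants and returns a list.
all_texts = [info.text for info in doc.findall('.//element/info1')]

# find() returns only the first match, or None if there is none.
first = doc.find('.//info1')
first_text = first.text if first is not None else None

missing = doc.find('.//info2')  # no such element in this sample

print(all_texts, first_text, missing)
```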

With the DOM API, .firstChild could easily be a Text node instead of an Element node; you always need to loop over the .childNodes sequence to find the first Element match:

def findFirstElement(node):
    for child in node.childNodes:
        if child.nodeType == node.ELEMENT_NODE:
            return child

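but for your case, perhaps using .getElementsByTagName() suffices. It returns a list of nodes, so index into it and read the text from the element's first child:

root.getElementsByTagName('info1')[0].firstChild.data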
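
The minidom points in this answer can be checked with a short, self-contained sketch. It assumes an indented document in the question's shape with a made-up info1 value, takes root to be the document element, and uses a hypothetical helper name, first_element_child.

```python
from xml.dom import minidom

# Indented sample in the question's shape; the text "abc" is made up.
dom = minidom.parseString(
    "<layer1>\n"
    "    <layer2>\n"
    "        <element>\n"
    "            <info1>abc</info1>\n"
    "        </element>\n"
    "    </layer2>\n"
    "</layer1>"
)
root = dom.documentElement  # the <layer1> element

# The indentation before <layer2> is itself a child: a whitespace Text node.
first_is_text = root.firstChild.nodeType == root.TEXT_NODE


def first_element_child(node):
    """Return the first child that is an Element, or None if there is none."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            return child
    return None


layer2 = first_element_child(root)

# getElementsByTagName() returns a list of nodes, so index it first,
# then read the text from the element's Text child (absent if it is empty).
info1 = root.getElementsByTagName('info1')[0]
value = info1.firstChild.data if info1.firstChild is not None else None

print(first_is_text, layer2.tagName, value)
```

This is why chaining .firstChild on indented XML tends to fail: each level of indentation adds a Text node ahead of the element you actually want.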



Answer 2:


Does this work? (I'm not great at Python; this is just a quick thought.)

name[0].firstChild.nodeValue


Source: https://stackoverflow.com/questions/16196501/python-minidom-how-to-access-an-element
