Finding an XML node by its name rather than by its index

Submitted by 两盒软妹~` on 2020-01-06 19:44:10

Question


How do I find an XML node by its name and get the value between its tags?

I'm doing that the following way:

import xml.etree.ElementTree as ET
from xml.dom import minidom
dom = minidom.parseString(ET.tostring(ET.fromstring(some_xml), "utf-8"))
self.a1 = dom.childNodes[0].childNodes[4].childNodes[0].nodeValue
self.a2 = dom.childNodes[0].childNodes[5].childNodes[0].nodeValue

I want to do that using the name of the tag instead of its index in the childNodes array. How?

update:

<ReconnectResponse xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://ccc.aaa.bbb/api/v1"> 
  <ErrorMessage /> 
  <ErrorCode>0</ErrorCode> 
  <ServerTime>aaa</ServerTime> 
  <OAuthToken>bbb</OAuthToken> 
  <OAuthTokenSecret>ccc</OAuthTokenSecret> 
</ReconnectResponse>

and the code:

dom.getElementsByTagName("ServerTime") # => []

update 2:

dom.toxml()
u'<?xml version="1.0" ?><ns0:ReconnectResponse xmlns:ns0="http://ccc.aaa.bbb/api/v1">\n  <ns0:ErrorMessage/>\n  <ns0:ErrorCode>0</ns0:ErrorCode>\n  <ns0:ServerTime>aaa</ns0:ServerTime>\n  <ns0:OAuthToken>bbb</ns0:OAuthToken>\n  <ns0:OAuthTokenSecret>ccc</ns0:OAuthTokenSecret>\n</ns0:ReconnectResponse>'

But how do I get the value? I tried this:

(Pdb) dom.getElementsByTagName("ns0:OAuthToken")
[<DOM Element: ns0:OAuthToken at 0x10635a878>]
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0]
<DOM Element: ns0:OAuthToken at 0x10635a878>
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0].nodeValue
(Pdb) dom.getElementsByTagName("ns0:OAuthToken")[0].toxml()
u'<ns0:OAuthToken>aaaaaa</ns0:OAuthToken>'

Answer 1:


You need to use getElementsByTagNameNS, because the document has no tag named ServerTime. It has one named {http://ccc.aaa.bbb/api/v1}ServerTime, where the part in braces, http://ccc.aaa.bbb/api/v1, is the default namespace.

getElementsByTagNameNS("http://ccc.aaa.bbb/api/v1", "ServerTime")

This namespace is implicitly applied to every unprefixed tag in your XML body because of the last attribute of the document element:

<ReconnectResponse ... xmlns="http://ccc.aaa.bbb/api/v1">
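
Putting this together with the XML from the question, a minimal sketch might look like the following. It also covers why nodeValue printed nothing in the question: an element node's nodeValue is None, and the text lives in the element's child text node. The names SOME_XML, NS and text_of are made up for this illustration, and the XML is parsed directly with minidom rather than round-tripped through ElementTree.

```python
from xml.dom import minidom

# Sample response copied from the question; the values are placeholders.
SOME_XML = """<ReconnectResponse xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://ccc.aaa.bbb/api/v1">
  <ErrorMessage />
  <ErrorCode>0</ErrorCode>
  <ServerTime>aaa</ServerTime>
  <OAuthToken>bbb</OAuthToken>
  <OAuthTokenSecret>ccc</OAuthTokenSecret>
</ReconnectResponse>"""

# The default namespace declared on the document element.
NS = "http://ccc.aaa.bbb/api/v1"


def text_of(dom, local_name, namespace=NS):
    """Hypothetical helper: text of the first matching element, or None."""
    matches = dom.getElementsByTagNameNS(namespace, local_name)
    if not matches:
        return None
    first_child = matches[0].firstChild
    # An empty element such as <ErrorMessage /> has no child node at all.
    return first_child.nodeValue if first_child is not None else None


dom = minidom.parseString(SOME_XML)
server_time = text_of(dom, "ServerTime")
oauth_token = text_of(dom, "OAuthToken")
error_message = text_of(dom, "ErrorMessage")
print(server_time, oauth_token, error_message)
```

Looking elements up by namespace URI and local name keeps the code independent of whatever prefix (such as ns0) a serializer happens to choose.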



Answer 2:


In Python, a common approach is to use lxml with XPath.

Since you explicitly want to use minidom, you can use the following to get all XML elements with a particular tag name and print their text:

matches = dom.getElementsByTagName("foo")
for e in matches:
    print(e.firstChild.nodeValue)
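
If minidom is not a hard requirement, a lookup by name can also be sketched with the standard library's ElementTree, which the question already imports as ET. Note that this swaps in ElementTree for lxml so that it runs without third-party packages; ElementTree supports only a limited subset of XPath. The prefix "api" and the names SOME_XML and NAMESPACES are arbitrary choices for this illustration; only the namespace URI has to match the document.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the question; the values are placeholders.
SOME_XML = """<ReconnectResponse xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://ccc.aaa.bbb/api/v1">
  <ErrorMessage />
  <ErrorCode>0</ErrorCode>
  <ServerTime>aaa</ServerTime>
  <OAuthToken>bbb</OAuthToken>
  <OAuthTokenSecret>ccc</OAuthTokenSecret>
</ReconnectResponse>"""

# Map an arbitrary prefix to the document's default namespace URI.
NAMESPACES = {"api": "http://ccc.aaa.bbb/api/v1"}

root = ET.fromstring(SOME_XML)

# findtext returns the text of the first matching child, or the default if none matches.
token = root.findtext("api:OAuthToken", namespaces=NAMESPACES)
secret = root.findtext("api:OAuthTokenSecret", namespaces=NAMESPACES)
missing = root.findtext("api:NoSuchTag", default=None, namespaces=NAMESPACES)
print(token, secret, missing)
```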


Source: https://stackoverflow.com/questions/26522798/finding-an-xml-node-by-its-name-rather-than-by-its-index
