Accessing XMLNS attribute with Python ElementTree?

Anonymous (unverified), submitted on 2019-12-03 01:56:01

Question:

How can one access namespace (xmlns) attributes using ElementTree?

With the following:

    <data xmlns="http://www.foo.net/a" category="ABS" date="2009-12-22" book="1">

When I call root.get('xmlns') I get back None, while category and date come back fine. Any help appreciated.

Answer 1:

I think element.tag is what you're looking for. Note that your example is missing a trailing slash, so it's unbalanced and won't parse. I've added one in my example.

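    >>> from xml.etree import ElementTree as ET
    >>> data = '''<data xmlns="http://www.foo.net/a" category="ABS" date="2009-12-22" book="1"/>'''
    >>> element = ET.fromstring(data)
    >>> element
    <Element '{http://www.foo.net/a}data' at 0x...>
    >>> element.tag
    '{http://www.foo.net/a}data'
    >>> element.attrib
    {'category': 'ABS', 'date': '2009-12-22', 'book': '1'}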

If you just want to know the xmlns URI, you can split it out with a function like:

    def tag_uri_and_name(elem):
        if elem.tag[0] == "{":
            uri, ignore, tag = elem.tag[1:].partition("}")
        else:
            uri = None
            tag = elem.tag
        return uri, tag

For much more on namespaces and qualified names in ElementTree, see effbot's examples.
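
As a follow-on, here is a minimal sketch of putting the extracted URI to use. The parser consumes the xmlns declaration rather than storing it in attrib, which is why root.get('xmlns') returns None. The URI recovered from the tag can still be passed to find() through its namespaces argument. The child <entry> element and the prefix "a" are assumptions added for illustration; neither comes from the question.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample: the <entry> child is invented for this illustration.
data = '''<data xmlns="http://www.foo.net/a" category="ABS" date="2009-12-22" book="1">
  <entry>first</entry>
</data>'''

root = ET.fromstring(data)

# The namespace declaration is not exposed as an ordinary attribute.
xmlns_attr = root.get("xmlns")  # None

# Recover the URI from the qualified tag, which has the form '{uri}localname'.
uri = root.tag[1:].partition("}")[0] if root.tag.startswith("{") else None

# Map an arbitrary prefix to the URI and use it in a path expression.
ns = {"a": uri}
entry = root.find("a:entry", ns)

print(xmlns_attr, uri, entry.text)
```

The prefix used in the lookup does not have to match any prefix in the document; only the URI it maps to matters.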



Answer 2:

Look at the effbot namespaces documentation/examples; specifically the parse_map function. It shows you how to add an *ns_map* attribute to each element which contains the prefix/URI mapping that applies to that specific element.

However, that adds the ns_map attribute to every element. For my needs, I wanted a single global map of all the namespaces used, which makes element lookups easier and avoids hardcoding URIs.

Here's what I came up with:

    import xml.etree.ElementTree as ET

    def parse_and_get_ns(file):
        events = "start", "start-ns"
        root = None
        ns = {}
        for event, elem in ET.iterparse(file, events):
            if event == "start-ns":
                # For "start-ns" events, elem is a (prefix, uri) tuple.
                prefix, uri = elem
                qualified = "{%s}" % uri
                if prefix in ns and ns[prefix] != qualified:
                    # NOTE: It is perfectly valid to have the same prefix refer
                    #     to different URI namespaces in different parts of the
                    #     document. This exception serves as a reminder that this
                    #     solution is not robust. Use at your own peril.
                    raise KeyError("Duplicate prefix with different URI found.")
                ns[prefix] = qualified
            elif event == "start":
                # The first "start" event is the root element.
                if root is None:
                    root = elem
        return ET.ElementTree(root), ns

With this you can parse an xml file and obtain a dict with the namespace mappings. So, if you have an xml file like the following ("my.xml"):

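    <rss version="2.0"
        xmlns:content="http://purl.org/rss/1.0/modules/content/"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
        <feed>
            <item>
                <title>Foo</title>
                <dc:creator>Joe McGroin</dc:creator>
                <description>etc...</description>
            </item>
        </feed>
    </rss>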

You will be able to use the XML namespaces and get info for elements like dc:creator:

    >>> tree, ns = parse_and_get_ns("my.xml")
    >>> ns
    {'content': '{http://purl.org/rss/1.0/modules/content/}', 'dc': '{http://purl.org/dc/elements/1.1/}'}
    >>> item = tree.find("./feed/item")
    >>> item.findtext(ns['dc'] + "creator")
    'Joe McGroin'
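
The NOTE in parse_and_get_ns points out that one prefix may legitimately map to different URIs in different parts of a document. One way around that, sketched here under assumptions, is to track the declarations currently in scope and record a separate mapping for each element. The function name parse_with_ns_maps, the SAMPLE document, and the example.com URIs are hypothetical. This is not effbot's parse_map. It keeps the mappings in a side dictionary instead of setting an attribute on each element, because the standard library's Element type may not accept arbitrary new attributes.

```python
import io
import xml.etree.ElementTree as ET

def parse_with_ns_maps(source):
    """Parse source and return (tree, ns_maps).

    ns_maps maps each element to the prefix -> URI dict in scope for it.
    """
    events = ("start", "start-ns", "end-ns")
    root = None
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    ns_maps = {}
    for event, payload in ET.iterparse(source, events):
        if event == "start-ns":
            # Reported before the "start" event of the declaring element.
            scopes.append(payload)
        elif event == "end-ns":
            # Reported once per declaration when its element closes.
            scopes.pop()
        else:
            if root is None:
                root = payload
            # Later (inner) declarations override earlier ones for a prefix.
            ns_maps[payload] = dict(scopes)
    return ET.ElementTree(root), ns_maps

# Hypothetical document that reuses the prefix "a" for two different URIs.
SAMPLE = (b'<root xmlns:a="http://example.com/one"><a:x/>'
          b'<inner xmlns:a="http://example.com/two"><a:y/></inner></root>')

tree, ns_maps = parse_with_ns_maps(io.BytesIO(SAMPLE))
root = tree.getroot()
inner = root.find("inner")
print(ns_maps[root], ns_maps[inner])
```

The trade-off is memory: every element gets its own small dict, so for large documents the single global map above is cheaper when prefixes are never redefined.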

