Python lxml - using the xml:lang attribute to retrieve an element

Submitted by 偶尔善良 on 2019-12-06 10:00:34

The xml prefix in xml:lang does not need to be declared in an XML document, but if you want to use xml:lang in XPath lookups, you have to define a prefix mapping in the Python code.

The xml prefix is reserved (as opposed to "normal" namespace prefixes which are arbitrary) and defined to be bound to http://www.w3.org/XML/1998/namespace. See the Namespaces in XML 1.0 W3C recommendation.
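To make the binding concrete, the sketch below shows how lxml stores the attribute once the document is parsed: the key is the fully qualified name in Clark notation ({namespace-URI}localname), not the literal string xml:lang. The sample document and the names XML_NS and LANG_ATTR are made up for illustration and are not part of lxml.

```python
from lxml import etree

# The namespace URI that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Clark notation: "{namespace-uri}localname".
LANG_ATTR = "{%s}lang" % XML_NS

# Illustrative sample document; note that it declares no xmlns:xml binding.
sample = etree.fromstring(
    '<root><Title xml:lang="FR">Les Tudors</Title>'
    '<Title xml:lang="DE">Die Tudors</Title></root>'
)

# The parsed attribute is keyed by its qualified name, so it can be read
# directly with get() and no prefix map.
titles_by_lang = {t.get(LANG_ATTR): t.text for t in sample.iter("Title")}
print(titles_by_lang)
```

This is why a prefix map is only needed when a path expression spells the attribute with the xml: prefix: the prefix has to be translated into this qualified name before the lookup can run.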

Example:

from lxml import etree

# Required mapping
nsmap = {"xml": "http://www.w3.org/XML/1998/namespace"}

XML = """
<root>
  <Title xml:lang="FR" type="main">Les Tudors</Title>
  <Title xml:lang="DE" type="main">Die Tudors</Title>
  <Title xml:lang="IT" type="main">The Tudors</Title>
</root>"""

doc = etree.fromstring(XML)

# Find the Title element whose xml:lang attribute is "FR"
title_FR = doc.find('Title[@xml:lang="FR"]', namespaces=nsmap)
print(title_FR.text)

Output:

Les Tudors

If there is no mapping for the xml prefix, the lookup fails with the error "prefix 'xml' not found in prefix map". If the xml prefix is mapped to any URI other than http://www.w3.org/XML/1998/namespace, the find call in the code snippet above matches nothing and returns None.
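The three cases can be checked side by side. This is a minimal sketch with a made-up one-element document. The URI urn:example:not-the-xml-namespace is a deliberately wrong value chosen for illustration, and catching SyntaxError reflects how the lxml versions I am aware of report the missing prefix, so treat the exception type as an assumption.

```python
from lxml import etree

XML_NS = "http://www.w3.org/XML/1998/namespace"
doc = etree.fromstring('<root><Title xml:lang="FR">Les Tudors</Title></root>')
path = 'Title[@xml:lang="FR"]'

# Case 1: correct mapping, so the element is found.
found = doc.find(path, namespaces={"xml": XML_NS})

# Case 2: no mapping at all, so the path expression cannot be resolved.
try:
    doc.find(path)
    error_message = None
except SyntaxError as exc:  # assumed exception type, see lead-in
    error_message = str(exc)

# Case 3: prefix mapped to some other URI (hypothetical value), so nothing
# matches and find() returns None.
wrong = doc.find(path, namespaces={"xml": "urn:example:not-the-xml-namespace"})

print(found.text)
print(error_message)
print(wrong)
```

The third case is the easy one to miss, because it fails silently instead of raising an error.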

If you have control over the XML file, you should change the xml:lang attribute to lang.

Or, if you do not have that control, I would suggest adding the xml prefix to the nsmap (with http://www.w3.org/XML/1998/namespace in place of the <namespace> placeholder), like this:

nsmap = {'xmlns': 'urn:tva:metadata:2012',
         'mpeg7': 'urn:tva:mpeg7:2008',
         'xml': '<namespace>'}
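As a rough sketch of how such a combined map is used, the example below looks up a title in a document that has a default namespace. The document itself, the root element name TVAMain, and the prefix tva are assumptions made up for illustration. Only the xml URI is fixed by the XML specification.

```python
from lxml import etree

# Hypothetical document: the root element name and the content are
# illustrative assumptions, loosely following the prefix map above.
XML = """
<TVAMain xmlns="urn:tva:metadata:2012">
  <Title xml:lang="FR" type="main">Les Tudors</Title>
  <Title xml:lang="DE" type="main">Die Tudors</Title>
</TVAMain>"""

nsmap = {
    # Arbitrary prefix chosen here for the document's default namespace.
    "tva": "urn:tva:metadata:2012",
    # Reserved prefix; this URI is not a free choice.
    "xml": "http://www.w3.org/XML/1998/namespace",
}

doc = etree.fromstring(XML)

# Elements in the default namespace need a prefix in the path expression;
# the xml:lang attribute needs the xml mapping.
title_de = doc.find('tva:Title[@xml:lang="DE"]', namespaces=nsmap)
print(title_de.text)
```

The prefix used for the element namespace can be any name, as long as the path expression and the map agree on it.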