How to retrieve xsi:noNamespaceSchemaLocation from XML with lxml?

Posted by 断了今生、忘了曾经 on 2020-04-17 22:12:50

Question


I am trying to validate an XML document against the schema named in its xsi:noNamespaceSchemaLocation attribute.

I researched this question but could not find an existing solution.

My XML file looks like this:

<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
  <orderperson>John Smith</orderperson>
  <shipto>
    <name>Ola Nordmann</name>
    <address>Langgt 23</address>
    <city>4000 Stavanger</city>
    <country>Norway</country>
  </shipto>
  <item>
    <title>Empire Burlesque</title>
    <note>Special Edition</note>
    <quantity>1</quantity>
    <price>10.90</price>
  </item>
  <item>
    <title>Hide your heart</title>
    <quantity>1</quantity>
    <price>9.90</price>
  </item>
</shiporder>

I took it from W3Schools.

This is what I get when I parse the document and read attrib from the root element: {'{http://www.w3.org/2001/XMLSchema-instance}noNamespaceSchemaLocation': 'shiporder.xsd'}

How can I retrieve the schema location with lxml in Python? I have looked at other parsers, but so far I have no idea how to do it.


Answer 1:


Thanks to @mzjn for pointing out Clark notation.
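For readers unfamiliar with it, Clark notation writes a namespaced name as {namespace-uri}localname, which is the form lxml uses for attribute keys. The following is a minimal sketch of a direct lookup, assuming the attribute sits on the root element; the constant XSI_NS and the shortened one-element document are illustrative only.

```python
from lxml import etree

# Namespace URI bound to the xsi prefix in the question's document
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Shortened stand-in for the shiporder document (illustrative only)
xml = (
    '<shiporder orderid="889923" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="shiporder.xsd"/>'
)
root = etree.fromstring(xml)

# QName builds the Clark-notation key "{namespace-uri}localname"
key = etree.QName(XSI_NS, "noNamespaceSchemaLocation").text

# get() returns None when the attribute is absent
xsd_location = root.get(key)
print(key)
print(xsd_location)
```

Because the key depends only on the namespace URI, the lookup still works if the document binds that URI to a prefix other than xsi.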

The solution I came up with is:

from lxml import etree

...

root = etree.fromstring(xml)
# Attribute keys are stored in Clark notation, i.e. "{namespace-uri}localname",
# so compare the local name instead of the prefixed form "xsi:...".
xsd = None  # stays None if the attribute is missing
for attr in root.attrib:
    if etree.QName(attr).localname == 'noNamespaceSchemaLocation':
        xsd = root.attrib[attr]
        break

...

# Do validations based on the XSD location in xsd
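The validation step elided above could be sketched as follows. This is only one possible approach, under the assumption that the schema location is a file path relative to the XML document. The helper name validate_with_hinted_schema and the tiny note.xsd / note.xml pair are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile

from lxml import etree

XSI_NO_NS = "{http://www.w3.org/2001/XMLSchema-instance}noNamespaceSchemaLocation"


def validate_with_hinted_schema(xml_path):
    """Validate xml_path against the schema named by the root element's
    xsi:noNamespaceSchemaLocation. Returns (is_valid, error_log)."""
    doc = etree.parse(xml_path)
    location = doc.getroot().get(XSI_NO_NS)
    if location is None:
        raise ValueError("root element has no xsi:noNamespaceSchemaLocation")
    # Assumption: the location is a file path relative to the XML document
    base_dir = os.path.dirname(os.path.abspath(xml_path))
    schema = etree.XMLSchema(etree.parse(os.path.join(base_dir, location)))
    return schema.validate(doc), schema.error_log


# Hypothetical schema and document, used only for this demonstration
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XML = b"""<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="note.xsd"><body>hello</body></note>"""

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "note.xsd"), "wb") as f:
        f.write(XSD)
    xml_file = os.path.join(tmp, "note.xml")
    with open(xml_file, "wb") as f:
        f.write(XML)
    is_valid, errors = validate_with_hinted_schema(xml_file)

print(is_valid)
```

If the attribute holds a URL rather than a local path, the schema would have to be fetched or resolved some other way first. That case is not covered here.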


Source: https://stackoverflow.com/questions/60448596/how-to-retrieve-xsinonamespaceschemalocation-from-xml-with-lxml
