How does XPath deal with XML namespaces?

Posted by China☆狼群 on 2019-11-25 23:56:27

Question


If I use

/IntuitResponse/QueryResponse/Bill/Id

to parse the XML document below I get 0 nodes back.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<IntuitResponse xmlns="http://schema.intuit.com/finance/v3"
                time="2016-10-14T10:48:39.109-07:00">
    <QueryResponse startPosition="1" maxResults="79" totalCount="79">
        <Bill domain="QBO" sparse="false">
            <Id>=1</Id>
        </Bill>
    </QueryResponse>
</IntuitResponse>

However, I'm not specifying the namespace in the XPath (i.e. http://schema.intuit.com/finance/v3 is not a prefix of each token of the path). How can XPath know which Id I want if I don't tell it explicitly? I suppose in this case (since there is only one namespace) XPath could get away with ignoring the xmlns entirely. But if there are multiple namespaces, things could get ugly.


Answer 1:


Defining namespaces in XPath (recommended)

XPath itself doesn't have a way to bind a namespace prefix to a namespace URI. Such facilities are provided by the hosting library.

It is recommended that you use those facilities and define namespace prefixes that can then be used to qualify XML element and attribute names as necessary.


Here are some of the various mechanisms which XPath hosts provide for specifying namespace prefix bindings to namespace URIs:

XSLT:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:i="http://schema.intuit.com/finance/v3">
   ...

Perl (LibXML):

my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs('i', 'http://schema.intuit.com/finance/v3');
my @nodes = $xc->findnodes('/i:IntuitResponse/i:QueryResponse');

Python (lxml):

from io import StringIO
from lxml import etree
f = StringIO('<IntuitResponse>...</IntuitResponse>')
doc = etree.parse(f)
r = doc.xpath('/i:IntuitResponse/i:QueryResponse', 
              namespaces={'i':'http://schema.intuit.com/finance/v3'})

Python (ElementTree):

namespaces = {'i': 'http://schema.intuit.com/finance/v3'}
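# ElementTree rejects absolute paths on an element, so search relative to root
# (root is assumed here to be the IntuitResponse element).
root.findall('i:QueryResponse', namespaces)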

Java (SAX):

NamespaceSupport support = new NamespaceSupport();
support.pushContext();
support.declarePrefix("i", "http://schema.intuit.com/finance/v3");

Java (XPath):

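xpath.setNamespaceContext(new NamespaceContext() {
    public String getNamespaceURI(String prefix) {
        switch (prefix) {
            case "i": return "http://schema.intuit.com/finance/v3";
            // ...
            default: return XMLConstants.NULL_NS_URI;  // unbound prefix
        }
    }
    // The interface also requires these; XPath evaluation does not use them.
    public String getPrefix(String uri) { return null; }
    public Iterator<String> getPrefixes(String uri) { return null; }
});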
  • Remember to call DocumentBuilderFactory.setNamespaceAware(true).
  • See also: Java XPath: Queries with default namespace xmlns

xmlstarlet:

-N i="http://schema.intuit.com/finance/v3"

JavaScript:

See Implementing a User Defined Namespace Resolver:

function nsResolver(prefix) {
  var ns = {
    'i' : 'http://schema.intuit.com/finance/v3'
  };
  return ns[prefix] || null;
}
document.evaluate( '/i:IntuitResponse/i:QueryResponse', 
                   document, nsResolver, XPathResult.ANY_TYPE, 
                   null );

PHP:

Adapted from @Tomalak's answer using DOMDocument:

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace("i", "http://schema.intuit.com/finance/v3");

$result = $xpath->query("/i:IntuitResponse/i:QueryResponse");

See also @IMSoP's canonical Q/A on PHP SimpleXML namespaces.

C#:

XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3");
XmlNodeList nodes = el.SelectNodes(@"/i:IntuitResponse/i:QueryResponse", nsmgr);

VBA:

xmlNS = "xmlns:i='http://schema.intuit.com/finance/v3'"
doc.setProperty "SelectionNamespaces", xmlNS
Set queryResponseElement = doc.SelectSingleNode("/i:IntuitResponse/i:QueryResponse")

VB.NET:

xmlDoc = New XmlDocument()
xmlDoc.Load("file.xml")
nsmgr = New XmlNamespaceManager(xmlDoc.NameTable)
nsmgr.AddNamespace("i", "http://schema.intuit.com/finance/v3")
nodes = xmlDoc.DocumentElement.SelectNodes("/i:IntuitResponse/i:QueryResponse",
                                           nsmgr)

Ruby (Nokogiri):

puts doc.xpath('/i:IntuitResponse/i:QueryResponse',
                'i' => "http://schema.intuit.com/finance/v3")

Note that Nokogiri supports removal of namespaces,

doc.remove_namespaces!

but see the warnings below against defeating XML namespaces.


Once you've declared a namespace prefix, your XPath can be written to use it:

/i:IntuitResponse/i:QueryResponse
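
To tie this back to the question, here is a minimal sketch using Python's standard-library xml.etree.ElementTree, chosen only because it needs no installation. It implements a limited XPath subset, so the paths are written relative to the root element rather than as absolute paths. The variable names and the prefixes i and x are illustrative assumptions; the document is the one from the question, minus the XML declaration.

```python
import xml.etree.ElementTree as ET

XML = """\
<IntuitResponse xmlns="http://schema.intuit.com/finance/v3"
                time="2016-10-14T10:48:39.109-07:00">
    <QueryResponse startPosition="1" maxResults="79" totalCount="79">
        <Bill domain="QBO" sparse="false">
            <Id>=1</Id>
        </Bill>
    </QueryResponse>
</IntuitResponse>
"""

root = ET.fromstring(XML)  # root is the IntuitResponse element

# Unqualified names only match elements in no namespace, so nothing is found.
unqualified = root.findall("QueryResponse/Bill/Id")

# Bind a prefix to the namespace URI and qualify every step of the path.
ns = {"i": "http://schema.intuit.com/finance/v3"}
qualified = root.findall("i:QueryResponse/i:Bill/i:Id", ns)

# The prefix itself is arbitrary: a different prefix bound to the same URI
# selects the same elements.
other = root.findall("x:QueryResponse/x:Bill/x:Id", {"x": ns["i"]})

print(len(unqualified), len(qualified), qualified == other)
```

The default namespace declared on IntuitResponse applies to every descendant element, which is why each step of the path needs the prefix, not just the first.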

Defeating namespaces in XPath (not recommended)

An alternative is to write predicates that test against local-name():

/*[local-name()='IntuitResponse']/*[local-name()='QueryResponse']/@startPosition

Or, in XPath 2.0:

/*:IntuitResponse/*:QueryResponse/@startPosition

Skirting namespaces in this manner works but is not recommended because it

  • Under-specifies the full element/attribute name.
  • Fails to differentiate between element/attribute names in different namespaces (the very purpose of namespaces). Note that this concern could be addressed by adding an additional predicate to check the namespace URI explicitly¹:

    /*[    namespace-uri()='http://schema.intuit.com/finance/v3' 
       and local-name()='IntuitResponse']
    /*[    namespace-uri()='http://schema.intuit.com/finance/v3' 
       and local-name()='QueryResponse']
    /@startPosition
    

    ¹ Thanks to Daniel Haley for the namespace-uri() note.

  • Is excessively verbose.
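
The second point can be made concrete with a small sketch. It uses the {*} namespace wildcard of Python's standard-library xml.etree.ElementTree (available since Python 3.8), which plays roughly the same role as *:name in XPath 2.0. The two-namespace document, the http://example.com/other URI, and the element contents are hypothetical, invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two Id elements with the same local name in
# different namespaces (the second URI is made up for this sketch).
XML = """\
<i:Bill xmlns:i="http://schema.intuit.com/finance/v3"
        xmlns:o="http://example.com/other">
    <i:Id>intuit-id</i:Id>
    <o:Id>other-id</o:Id>
</i:Bill>
"""

root = ET.fromstring(XML)

# Namespace wildcard: matches Id in any namespace, so both elements come back.
any_ns = [e.text for e in root.findall("{*}Id")]

# Namespace-qualified name: only the Id in the Intuit namespace is selected.
ns = {"i": "http://schema.intuit.com/finance/v3"}
intuit_only = [e.text for e in root.findall("i:Id", ns)]

print(any_ns)
print(intuit_only)
```

With a single namespace the wildcard happens to work, but as soon as a second vocabulary reuses a local name the query silently returns more than intended.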



Source: https://stackoverflow.com/questions/40796231/how-does-xpath-deal-with-xml-namespaces
