Basic DOMXpath, can be wrong? (re: check input namespace always)

Submitted by 耗尽温柔 on 2019-12-23 04:34:00

Question


I am using an internal DOMDocument inside a class, $this->doc->dom, and I think it is fine because $this->doc->dom->saveXML() works and shows my XML, something like

  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
      <title>04</title>
      <link href="css/04.css" rel="stylesheet" type="text/css"/>
    </head>
   ...

And when I use

 $xpath = new DOMXPath($this->doc->dom);
 $elements = $xpath->query('//link');

no error is reported, but no elements are returned (!):

   print $elements->length;

shows 0 (zero). That is the problem, and to me it looks like a DOMDocument BUG: the <link ../> element is there!


Edit to add more clues...

When I do a similar thing with getElementsByTagName() it works (!), so the problem is not in $this->doc->dom.

 $test = $this->doc->dom->getElementsByTagName('link');
 print $test->length; // OK, not zero, returns 1!

Answer 1:


It is not a "DomDocument bug".

Simple solutions

Consolidating the posted comments.

Register the namespace

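(@PaulT's answer)   The root element (the html tag) declares a default namespace, xmlns="http://www.w3.org/1999/xhtml". XPath 1.0 treats an unprefixed name such as link as belonging to no namespace, so //link cannot match the namespaced element. With registerNamespace() you can bind that URI to an arbitrary prefix (here xx) and then write a query that matches: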

$xpath->registerNamespace('xx', "http://www.w3.org/1999/xhtml"); 
$xpath->query('//xx:link');
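
To see the difference in isolation, here is a minimal sketch that can be run on its own. The XML string is a shortened stand-in for the document in the question, and the variable names are only illustrative.

```php
<?php
// Shortened stand-in for the XHTML document from the question.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>04</title>
    <link href="css/04.css" rel="stylesheet" type="text/css"/>
  </head>
</html>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);

// Unprefixed names in XPath 1.0 only match elements in no namespace.
$plain = $xpath->query('//link')->length;

// Bind the XHTML namespace URI to an arbitrary prefix and query again.
$xpath->registerNamespace('xx', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query('//xx:link')->length;

echo "plain: $plain, prefixed: $prefixed\n"; // plain: 0, prefixed: 1
```

The prefix xx exists only inside the XPath object; it does not have to appear anywhere in the XML itself.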

Remove namespace attribute from root

I am filtering my input, so I changed it to

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<html>
 <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <title>04</title>
    <link href="css/04.css" rel="stylesheet" type="text/css"/>
 </head>
...
</html>

Now it works as I expected, without the need for "damn namespaces".
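
The post does not show how the input was filtered. One possible sketch, assuming the XML is available as a string before parsing, is a text-level filter; stripDefaultNamespace() is a hypothetical helper, not something from the original answer.

```php
<?php
// Hypothetical helper (an assumption, not from the original post): removes
// the default xmlns="..." declaration from the raw XML text before parsing.
function stripDefaultNamespace(string $xml): string
{
    // Matches only the unprefixed xmlns attribute; prefixed declarations
    // such as xmlns:foo="..." are left alone.
    $pattern = '/\sxmlns\s*=\s*("[^"]*"|\'[^\']*\')/';
    return preg_replace($pattern, '', $xml) ?? $xml;
}

$xml = '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
     . '<link href="css/04.css" rel="stylesheet" type="text/css"/>'
     . '</head></html>';

$dom = new DOMDocument();
$dom->loadXML(stripDefaultNamespace($xml));

$xpath = new DOMXPath($dom);
$count = $xpath->query('//link')->length;

echo $count, "\n"; // 1
```

Note that this works on the text rather than the parsed tree, so it would also alter a matching string inside text content or CDATA, and the resulting elements are no longer in the XHTML namespace. Registering the namespace is the less invasive route when the output must stay valid XHTML.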

Debugging XPath

(@RolandoIsidoro's answer)   When in trouble in cases like this, try a tool such as freeformatter.com/xpath-tester.html. For your example it reports an error that would have led you to the solution:

The default (no prefix) Namespace URI for XPath queries is always '' and it cannot be redefined to 'http://www.w3.org/1999/xhtml'
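
The same check can be done from PHP by looking at the root element's namespaceURI property before building the query. The sketch below assumes a hypothetical helper, xpathFor(), which registers whatever namespace the root element carries under a prefix of your choice.

```php
<?php
// Hypothetical helper (an assumption, not from the original post): builds a
// DOMXPath and, if the root element is in a namespace, registers that
// namespace under the given prefix.
function xpathFor(DOMDocument $dom, string $prefix = 'xx'): DOMXPath
{
    $xpath = new DOMXPath($dom);
    $ns = $dom->documentElement->namespaceURI; // null when there is no namespace
    if ($ns !== null && $ns !== '') {
        $xpath->registerNamespace($prefix, $ns);
    }
    return $xpath;
}

$dom = new DOMDocument();
$dom->loadXML(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    . '<head><link href="css/04.css"/></head></html>'
);

$rootNs = $dom->documentElement->namespaceURI;
$found  = xpathFor($dom)->query('//xx:link')->length;

echo $rootNs, ' ', $found, "\n"; // http://www.w3.org/1999/xhtml 1
```

If the root has no namespace, nothing is registered and the query should be written without the prefix (//link); using an unregistered prefix makes query() emit a warning.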



Source: https://stackoverflow.com/questions/18722553/basic-domxpath-can-be-wrong-re-check-input-namespace-always
