Question
I am trying to get the contents of the div below using XPath in PHP.
<div class='post-body entry-content' id='post-body-37'>
<div style="text-align: left;">
<div style="text-align: center;">
Hi
</div></div></div>
I am using the PHP code below to get the output.
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$xpath->registerPhpFunctions('preg_match');
$regex = 'post-(content|[a-z]+)';
$items = $xpath->query("div[ php:functionString('preg_match', '$regex', @class) > 0]");
dd($items);
It returns the output below:
DOMNodeList {#580
+length: 0
}
Answer 1:
For a simple task like this, getting the div nodes whose class attribute starts with post- and contains content, you should use a regular, simple XPath query:
$xp->query('//div[starts-with(@class,"post-") and contains(@class, "content")]');
Here,
- //div - get all divs that...
- starts-with(@class,"post-") - have a "class" attribute starting with "post-"
- and - and...
- contains(@class, "content") - contain the "content" substring in the class attribute value.
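As a minimal sketch, the query above can be run against the markup from the question. The variable names $count and $firstId are introduced here only for illustration; everything else is taken from the question and this answer.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<div class='post-body entry-content' id='post-body-37'>
<div style="text-align: left;">
<div style="text-align: center;">
Hi
</div></div></div>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Only the outer div carries a class attribute, so only it can satisfy both tests.
$divs = $xp->query('//div[starts-with(@class,"post-") and contains(@class, "content")]');

$count = $divs->length;                          // illustrative variable
$firstId = $divs->item(0)->getAttribute('id');   // illustrative variable
echo $count, ' match, id=', $firstId, PHP_EOL;
```

No namespace or function registration is needed here, because starts-with and contains are built-in XPath 1.0 string functions.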
To use php:functionString, you need to register the php namespace (with $xp->registerNamespace("php", "http://php.net/xpath");) and the PHP functions (to register them all, use $xp->registerPHPFunctions();).
For complex scenarios, when you need to analyze the values even deeper, you may want to create and register your own functions:
function example($attr) {
return preg_match('/post-(content|[a-z]+)/i', $attr) > 0;
}
and then inside XPath:
$divs = $xp->query("//div[php:functionString('example', @class)]");
Here, functionString passes the string contents of the @class attribute to the example function, not the object (as would be the case with php:function).
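The difference between the two calls can be sketched by asking each callback what it received. The helpers describe_string and describe_nodes are hypothetical names made up for this sketch, and the one-line HTML input is a shortened stand-in for the markup in the question.

```php
<?php
// Hypothetical helper: reports the PHP type it was handed.
function describe_string($arg) {
    // php:functionString converts the attribute to its string value first.
    return gettype($arg);
}
// Hypothetical helper: reports the class of the first node it was handed.
function describe_nodes($arg) {
    // php:function hands over the node-set as an array of DOM node objects.
    return is_array($arg) ? get_class($arg[0]) : gettype($arg);
}

$dom = new DOMDocument;
$dom->loadHTML("<div class='post-body entry-content'>Hi</div>");
$xp = new DOMXPath($dom);
$xp->registerNamespace("php", "http://php.net/xpath");
$xp->registerPHPFunctions(['describe_string', 'describe_nodes']);

// evaluate() returns the XPath string result as a PHP string.
$asString = $xp->evaluate("string(php:functionString('describe_string', //div/@class))");
$asNodes  = $xp->evaluate("string(php:function('describe_nodes', //div/@class))");
echo $asString, ' / ', $asNodes, PHP_EOL;
```

For a regex test on an attribute value, functionString is usually the more convenient choice, since preg_match expects a string subject.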
See IDEONE demo:
function example($attr) {
return preg_match('/post-(content|[a-z]+)/i', $attr) > 0;
}
$html = <<<HTML
<body>
<div class='post-body entry-content' id='post-body-37'>
<div style="text-align: left;">
<div style="text-align: center;">
Hi
</div></div></div>
</body>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED|LIBXML_HTML_NODEFDTD);
$xp = new DOMXPath($dom);
$xp->registerNamespace("php", "http://php.net/xpath");
$xp->registerPHPFunctions('example');
$divs = $xp->query("//div[php:functionString('example', @class)]");
foreach ($divs as $div) {
echo $div->nodeValue;
}
See also a nice article about using PHP functions inside XPath: Using PHP Functions in XPath Expressions.
Answer 2:
Here is a working version that incorporates the advice you got in the comments:
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// you need to register the namespace "php" to make it available in the query
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPhpFunctions('preg_match');
// add delimiters to your pattern
$regex = '~post-(content|[a-z]+)~';
// search your node anywhere in the DOM tree with "//"
$items = $xpath->query("//div[php:functionString('preg_match', '$regex', @class)>0]");
var_dump($items);
Obviously, this kind of pattern is pointless here, since you can get the same result with the built-in XPath string functions such as contains.
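To illustrate the comment about "//" in the code above, here is a small sketch comparing a relative path with a descendant path on the markup from the question. The variable names $relative and $anywhere are introduced only for this illustration.

```php
<?php
// Sample markup copied from the question.
$html = <<<HTML
<div class='post-body entry-content' id='post-body-37'>
<div style="text-align: left;">
<div style="text-align: center;">
Hi
</div></div></div>
HTML;

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);   // without extra flags, the parser wraps the fragment in <html><body>
$xpath = new DOMXPath($dom);

// Relative path: div elements that are direct children of the document node.
$relative = $xpath->query("div")->length;
// Descendant path: div elements anywhere in the tree.
$anywhere = $xpath->query("//div")->length;

echo "div: $relative, //div: $anywhere", PHP_EOL;
```

Since the only child of the document node is the html element, the relative form finds nothing regardless of the predicate, which is consistent with the empty DOMNodeList shown in the question.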
Source: https://stackoverflow.com/questions/33409244/creating-preg-match-using-xpath-in-php