Question
I have the following HTML markup:
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
I want to extract 3,840 from the last li, located by its "Downloads:" label.
What do you suggest?
My attempt:
preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a);
Answer 1:
I suggest using an HTML parser here, DOMDocument with XPath in particular.
Example:
$markup = '<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>';
$dom = new DOMDocument();
$dom->loadHTML($markup);
$xpath = new DOMXPath($dom);
// Get the string value of the text node that follows the <strong> element whose text is "Downloads:"
$download = trim($xpath->evaluate("string(//strong[text()='Downloads:']/following-sibling::text())"));
echo $download; // 3,840
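The same lookup can be wrapped in a small function so that any label in the list can be queried. This is only a sketch: the name valueForLabel is hypothetical (it is not part of DOMDocument or of the original answer), and it assumes the label contains no double quote, because the label is embedded directly in the XPath expression. It uses normalize-space() so that stray whitespace inside the strong tag does not break the match, and it silences libxml parse warnings while loading the fragment.

```php
<?php
// Hypothetical helper (not from the original answer): returns the trimmed text
// that follows <strong>$label</strong> inside an <li>, or null if it is absent.
function valueForLabel(string $html, string $label): ?string
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Assumption: $label contains no double quote, since it is embedded in the query.
    $query = sprintf(
        '//li/strong[normalize-space(.)="%s"]/following-sibling::text()[1]',
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

$markup = '<ul><li> <strong>Hearing Impaired:</strong> No </li>'
        . '<li> <strong>Downloads:</strong> 3,840 </li></ul>';

echo valueForLabel($markup, 'Downloads:'), "\n"; // 3,840
var_dump(valueForLabel($markup, 'Missing:'));    // NULL
```

Returning null for a missing label lets the caller tell "not found" apart from an empty value.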
Answer 2:
Use an HTML parser for parsing HTML files. If you insist on a regex, you could try the following:
<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)
Code:
$string = <<<EOT
<ul>
<li>
<strong>Online:</strong>
2/14/2010 3:40 AM
</li>
<li>
<strong>Hearing Impaired:</strong>
No
</li>
<li>
<strong>Downloads:</strong>
3,840
</li>
</ul>
EOT;
// \K discards everything matched so far; the lookahead stops the match before the closing </li>.
$regex = '~<li>[^<>]*<strong>Downloads:<\/strong>\s*\K.*?(?=\s*<\/li>)~s';
if (preg_match($regex, $string, $m)) {
    $yourmatch = $m[0];
    echo $yourmatch; // 3,840
}
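For comparison, the attempt in the question fails because its pattern expects <strong> to come immediately after <li>, while the markup has a line break between them. A minimal sketch of that attempt with optional whitespace added, using a capture group instead of \K (the shortened $s below is an illustrative sample, not the full markup):

```php
<?php
// Shortened sample of the markup from the question, line breaks included.
$s = "<ul>\n<li>\n<strong>Downloads:</strong>\n3,840\n</li>\n</ul>";

// Original attempt: no match, because "<li>" is followed by a newline, not "<strong>".
var_dump(preg_match('/<li><strong>Downloads:<\/strong>(.*?)<\/li>/s', $s, $a)); // int(0)

// Allow whitespace between the tags and around the captured value.
$pattern = '/<li>\s*<strong>Downloads:<\/strong>\s*(.*?)\s*<\/li>/s';
if (preg_match($pattern, $s, $a)) {
    echo $a[1], "\n"; // 3,840
}
```

With a capture group the value is in $a[1] rather than $a[0], which is the only practical difference from the \K version above.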
Source: https://stackoverflow.com/questions/26449506/php-extracting-string-between-two-tags-by-childs-content