Regex to find html div class content and data-attr? (preg_match_all)

Posted by ぃ、小莉子 on 2019-12-23 02:29:31

Question


With preg_match_all, I want to extract the class and data-id attribute values from HTML tags.

The pattern below runs, but each match returns either the class value or the data-id value, never both.

I want a single pattern that captures both the class and the data-id content.

Which regex pattern should I use?

HTML contents:

<!-- I want to: $matches[1] == test_class  | $matches[2] == null -->
<div class="test_class"> 

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" data-id="1"> 

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div id="test_id" class="test_class" data-id="1">

<!-- I want to: $matches[1] == test_class test_class2 | $matches[2] == 1 -->
<div class="test_class test_class2" id="test_id" data-id="1">

<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div data-id="1" class="test_class test_class2" id="test_id" >

<!-- I want to: $matches[1] == 1 | $matches[2] == test_class test_class2 -->
<div id="test_id" data-id="1" class="test_class test_class2">

<!-- I want to: $matches[1] == test_class | $matches[2] == 1 -->
<div class="test_class" id="test_id" data-id="1">

The regex that does not work the way I want:

$pattern = '/<(div|i)\s.*(class|data-id)="([^"]+)"[^>]*>/i';

preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

Thanks in advance.


Answer 1:


Why not use a DOM parser instead?

You could use an XPath expression like //div[@class or @data-id] to locate the elements, then extract their attribute values:

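// Parse the markup; $html holds the HTML string to inspect.
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect libxml warnings about imperfect markup instead of printing them
$doc->loadHTML($html);

// Select every div that has a class or a data-id attribute.
$xpath = new DOMXPath($doc);
$divs = $xpath->query('//div[@class or @data-id]');
foreach ($divs as $div) {
  // getAttribute() returns an empty string when the attribute is absent.
  $matches = [$div->getAttribute('class'), $div->getAttribute('data-id')];
  print_r($matches);
}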

Demo ~ https://eval.in/1046227
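
The question's pattern also targets <i> tags and expects null when an attribute is missing. A minimal sketch of how the XPath approach might be extended to cover that, assuming a made-up sample string in $html and a result layout (the tag, class and data-id keys) chosen only for illustration:

```php
<?php
// Hypothetical sample input, loosely based on the question's test cases.
$html = '<div class="test_class"></div>'
      . '<div id="test_id" data-id="1" class="test_class test_class2"></div>'
      . '<i data-id="7"></i>'
      . '<div id="no_match"></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

// Match div or i elements that carry at least one of the two attributes.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//*[self::div or self::i][@class or @data-id]');

$rows = [];
foreach ($nodes as $node) {
    // hasAttribute() distinguishes a missing attribute from an empty one.
    $rows[] = [
        'tag'     => $node->nodeName,
        'class'   => $node->hasAttribute('class') ? $node->getAttribute('class') : null,
        'data-id' => $node->hasAttribute('data-id') ? $node->getAttribute('data-id') : null,
    ];
}

print_r($rows);
```

Because the values are keyed by attribute name, the order in which class and data-id appear in the tag no longer matters.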




Answer 2:


I second Phil's answer: an HTML parser is the way to go. It is safer and can handle much more complicated markup.

That said, if you want to try a regex on your example, it would be something like this:

<(?:div|i)(?:.*?(?:class|data-id)="([^"]+)")?(?:.*?(?:class|data-id)="([^"]+)")?[^>]*>

Example: https://regex101.com/r/Gb82lF/1/
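
A sketch of how a pattern of this shape might be used with preg_match_all. This is a variant, not the exact pattern above: it assumes a \b after the tag name so that "<i" does not match "<img", uses [^>]*? so each lazy scan stays inside one tag, and adds capture groups for the attribute names so the values can be keyed regardless of order. The sample input and the $result layout are illustrative only.

```php
<?php
// Hypothetical sample input, taken from the question's test cases.
$content = <<<HTML
<div class="test_class">
<div class="test_class" data-id="1">
<div data-id="1" class="test_class test_class2" id="test_id" >
HTML;

// Two optional (name, value) pairs: groups 1-2 and groups 3-4.
$pattern = '/<(?:div|i)\b(?:[^>]*?\s(class|data-id)="([^"]*)")?(?:[^>]*?\s(class|data-id)="([^"]*)")?[^>]*>/i';

preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

$result = [];
foreach ($matches as $m) {
    $attrs = ['class' => null, 'data-id' => null];
    // Trailing groups that did not participate are absent from $m,
    // so walk only the pairs that are actually present.
    for ($i = 1; $i + 1 < count($m); $i += 2) {
        if ($m[$i] !== '') {
            $attrs[strtolower($m[$i])] = $m[$i + 1];
        }
    }
    $result[] = $attrs;
}

print_r($result);
```

This still only looks at double-quoted attribute values and at most two attributes per tag, which is part of why the parser route is the more robust choice.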



Source: https://stackoverflow.com/questions/51778425/regex-to-find-html-div-class-content-and-data-attr-preg-match-all
