Finding HTML tags in string

Submitted by 守給你的承諾 on 2019-12-28 18:24:21

Question


I know this question has been asked on SO before, but I can't find the right one and I'm still bad at regex :/

I have a string, and that string is valid HTML. Now I want to find all the tags with a certain name and attribute.

I tried this regex (e.g. a div with a type): /(<div type="special_type" src="(.*?)<\/div>)/.

Example string:

<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>

If I use preg_match, I only get <div type="special_type" src="bla"> match me</div>, which is logical because the other one has its attributes in a different order.

What regex do I need to get the following array when using preg_match on the example string?

array(0 => '<div type="special_type" src="bla"> match me</div>',
      1 => '<div src="blaw" type="special_type" > match me too</div>')

Answer 1:


A piece of general advice: don't use regex to parse HTML. It will get messy as soon as the HTML changes.

Use DOMDocument instead:

$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($str);
$selector = new DOMXPath($doc);

// select every <div> whose type attribute is "special_type",
// regardless of the order of its attributes
$result = $selector->query('//div[@type="special_type"]');

// loop through all found items
foreach($result as $node) {
    echo $node->getAttribute('src'), "\n";
}
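
The loop above prints only the src attribute of each match. To get the array of complete tags that the question asks for, each matched node can be serialized back to markup with DOMDocument::saveHTML(). This is a minimal sketch under the same example string; the function name find_special_divs is hypothetical and not part of the answer:

```php
<?php
// Hypothetical helper: returns the outer HTML of every <div> whose
// type attribute equals $type.
function find_special_divs(string $html, string $type): array
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $found = [];
    // $type is interpolated into the XPath expression, so this sketch
    // assumes it contains no double quotes.
    foreach ($xpath->query('//div[@type="' . $type . '"]') as $node) {
        // Passing a node to saveHTML() serializes just that node.
        $found[] = $doc->saveHTML($node);
    }
    return $found;
}

$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

print_r(find_special_divs($str, 'special_type'));
```

Note that saveHTML() re-serializes the parsed node, so details such as whitespace inside the opening tag may differ from the input string.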



Answer 2:


As hek2msql said, you'd better use DOMDocument:

$html = '
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>';

$matches = get_matched($html);


function get_matched($html){
    $matched = array();

    $dom = new DOMDocument();
    // @ suppresses any warnings raised while parsing the markup
    @$dom->loadHTML($html);

    // fetch the node list once instead of re-querying it on every iteration
    foreach($dom->getElementsByTagName('div') as $div){
        if($div->getAttribute('type') != 'special_type')
            continue;

        $matched[] = $div->getAttribute('src');
        // or: $matched[] = $div->nodeValue;
    }

    return $matched;
}
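
The @ operator in get_matched() hides every error raised by that call, not just parser warnings. One alternative is to let libxml collect its errors internally and clear them afterwards. The following is only a sketch; the function name get_matched_quiet is hypothetical, and its behaviour is otherwise meant to mirror get_matched() above:

```php
<?php
// Hypothetical variant of get_matched() that avoids the @ operator.
function get_matched_quiet(string $html): array
{
    // Collect libxml parse errors internally instead of emitting warnings;
    // the call returns the previous setting so it can be restored.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard whatever the parser collected and restore the old setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matched = [];
    foreach ($dom->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('type') === 'special_type') {
            $matched[] = $div->getAttribute('src');
        }
    }
    return $matched;
}

$html = '
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>';

print_r(get_matched_quiet($html));
```

If the parser messages are of interest, libxml_get_errors() can be called before libxml_clear_errors() to inspect them.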


Source: https://stackoverflow.com/questions/18800807/finding-html-tags-in-string
