DomXPath with DOMDocument to get <img> Class URL

Posted by ♀尐吖头ヾ on 2019-12-25 02:59:24

Question


I am writing a small scraper script that finds the URL of an image with a particular class name. I know that my cURL and DOMDocument code is working, and as far as I can tell DomXPath is too (there are no errors). But I am struggling to work out how to get the URL from the XPath query results.

My code so far:

$dom = new DOMDocument();
@$dom->loadHTML($x);

$xpath = new DomXpath($dom);
$div = $xpath->query('//*[@class="productImage"]');


var_dump($div);
echo $div->item(0);

If I var_dump($x), the page is output without problems, so cURL is working fine. But I do not know how to get the data contained in $div. I am trying to find an image with a class of 'productImage', which looks like:

<img src="/uploads/5W/yP/5WyPP4l7Z-jmZRzu_MJ6zg/1077-d.jpg" border="1" alt="Album" class="productImage">

I want the source of that image tag.

Any suggestions?


Answer 1:


$dom = new DOMDocument();
$dom->loadHTML($x);

$xpath = new DomXpath($dom);
$imgs  = $xpath->query('//*[@class="productImage"]');

foreach($imgs as $img)
{
    echo 'ImgSrc: ' . $img->getAttribute('src') .'<br />' . PHP_EOL;
}

Try that...

== EDIT: Additional Info ==

The reason I use a loop here is that the query may match more than one img. If you know there is only one element (or you only want the first DOM node found), you can access the element from the DOMNodeList via its item() method, like so:

$dom = new DOMDocument();
$dom->loadHTML($x);

$xpath = new DomXpath($dom);
$img   = $xpath->query('//*[@class="productImage"]');

echo 'ImgSrc: ' . $img->item(0)->getAttribute('src') .'<br />' . PHP_EOL;
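
Two caveats apply to the item(0) approach: item(0) returns null when the query matches nothing, and @class="productImage" only matches when the class attribute is exactly that string. The following is a minimal sketch that guards against both, not a definitive implementation. The $html variable is an assumption standing in for the cURL response ($x in the question), and the contains(concat(...)) expression is a common XPath 1.0 idiom for matching one class among several.

```php
<?php
// Assumed sample markup standing in for the cURL response ($x in the question).
$html = '<html><body>'
      . '<img src="/uploads/5W/yP/5WyPP4l7Z-jmZRzu_MJ6zg/1077-d.jpg" border="1" alt="Album" class="productImage">'
      . '</body></html>';

// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Match "productImage" as one token of a space-separated class list.
$query = '//img[contains(concat(" ", normalize-space(@class), " "), " productImage ")]';
$nodes = $xpath->query($query);

// item(0) is null when nothing matched, so check the type before using it.
$first = $nodes->item(0);
$src   = ($first instanceof DOMElement) ? $first->getAttribute('src') : null;

echo ($src === null) ? 'No matching image' : 'ImgSrc: ' . $src;
echo PHP_EOL;
```

With the sample markup above, $src ends up holding the value of the src attribute from the question's img tag.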



Answer 2:


You don't actually need to use XPath here, because it seems that you're just after images and that can be done by using DOMDocument::getElementsByTagName(), followed by a simple filter:

foreach ($dom->getElementsByTagName('img') as $image) {
    $class = $image->getAttribute('class');
    if (strpos(" $class ", " productImage ") !== false) {
        $url = $image->getAttribute('src');
        // do stuff
    }
}

Then, you can get the src attribute by using DOMElement::getAttribute():

echo $image->getAttribute('src');
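
For reference, the getElementsByTagName() approach can be wrapped into a self-contained helper. This is only a sketch under stated assumptions: the function name findImageSrcsByClass and the sample markup are hypothetical, and the class check splits on any whitespace rather than using strpos, so that tabs or newlines between class names are also handled.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): returns the src of
 * every <img> whose class list contains $class.
 */
function findImageSrcsByClass(string $html, string $class): array
{
    libxml_use_internal_errors(true);   // keep parse warnings out of the output
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $srcs = [];
    foreach ($dom->getElementsByTagName('img') as $image) {
        // Split the class attribute on any whitespace into individual tokens.
        $tokens = preg_split('/\s+/', trim($image->getAttribute('class')));
        if (in_array($class, $tokens, true) && $image->hasAttribute('src')) {
            $srcs[] = $image->getAttribute('src');
        }
    }
    return $srcs;
}

// Assumed sample markup: two of the three images carry the target class.
$html = '<img src="/a.jpg" class="productImage"><img src="/b.jpg" class="thumb">'
      . '<img src="/c.jpg" class="large productImage">';
$result = findImageSrcsByClass($html, 'productImage');
print_r($result);
```

Returning an array keeps the caller free to take the first match or loop over all of them.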


Source: https://stackoverflow.com/questions/16054856/domxpath-with-domdocument-to-get-img-class-url
