PHP Simple HTML DOM Scrape External URL

Submitted by 帅比萌擦擦* on 2019-12-13 04:42:45

Question


I'm building a personal project, but I'm a bit stuck using the Simple HTML DOM class.

What I'd like to do is scrape a website and retrieve every element that matches a certain class, along with its inner HTML.

My code so far is:

    <?php
    error_reporting(E_ALL);
    include_once("simple_html_dom.php");

    // Fetch the page and parse it into a DOM object
    $url = 'http://www.peopleperhour.com/freelance-seo-jobs';
    $html = file_get_html($url);

    // Get all data inside the <div class="item-list">
    foreach ($html->find('div[class=item-list]') as $div) {
        // Get all divs inside "item-list"
        foreach ($div->find('div') as $d) {
            // Get the element's HTML (outertext includes the tag itself)
            $data = $d->outertext;
        }
    }
    print_r($data);
    echo "END";
    ?>

All I get is a blank page showing "END"; nothing else is output.


Answer 1:


I think you may want something like this:

    $url = 'http://www.peopleperhour.com/freelance-seo-jobs';
    $html = file_get_html($url);
    // Select every div.item nested anywhere inside div.item-list
    foreach ($html->find('div.item-list div.item') as $div) {
        echo $div . '<br />';
    }

This outputs the markup of each matching item, one after another (if you add the proper style sheet, it'll be displayed nicely).
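To see what the descendant selector `div.item-list div.item` picks out without fetching the live page, here is a minimal sketch. It does not use Simple HTML DOM: it swaps in PHP's built-in `DOMDocument` and `DOMXPath` so it runs with no extra files. The sample markup and the names `$sample`, `$query`, and `$items` are assumptions made for illustration only; the real page's structure may differ.

```php
<?php
// Hypothetical sample markup standing in for the real page (nothing is fetched here).
$sample = '<div class="item-list">'
        . '<div class="item"><h3>Job A</h3></div>'
        . '<div class="item"><h3>Job B</h3></div>'
        . '</div>'
        . '<div class="item">Outside the list</div>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<html><body>' . $sample . '</body></html>');
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// XPath counterpart of the CSS selector "div.item-list div.item":
// a div with class token "item" that is a descendant of a div with class token "item-list".
$query = '//div[contains(concat(" ", normalize-space(@class), " "), " item-list ")]'
       . '//div[contains(concat(" ", normalize-space(@class), " "), " item ")]';

$items = [];
foreach ($xpath->query($query) as $node) {
    // saveHTML($node) serializes the node including its own tag (its outer HTML).
    $items[] = $doc->saveHTML($node);
}

echo implode("\n", $items), "\n";
```

Only the two items nested inside the list are collected; the `div.item` outside `div.item-list` is skipped, which is the same filtering the Simple HTML DOM selector is meant to perform.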




Answer 2:


It seems your $data variable is overwritten on each iteration, so only the last match is kept. Try appending to it instead:

    $data = "";
    foreach ($html->find('div[class=item-list]') as $div) {
        // Get all divs inside "item-list"
        foreach ($div->find('div') as $d) {
            // Append each element's HTML instead of replacing the previous one
            $data .= $d->outertext;
        }
    }
    print_r($data);

I hope that helps.
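The difference between `=` and `.=` inside a loop can be checked in isolation. This is a small sketch with no scraping involved; the `$fragments` array is a hypothetical stand-in for the `outertext` of each matched div.

```php
<?php
// Hypothetical stand-ins for the outertext of each matched <div>.
$fragments = ['<div>first</div>', '<div>second</div>', '<div>third</div>'];

// Plain assignment: each pass replaces the previous value, so only the last one survives.
$overwritten = null;
foreach ($fragments as $fragment) {
    $overwritten = $fragment;
}

// Concatenating assignment: every fragment is appended to the running string.
$appended = '';
foreach ($fragments as $fragment) {
    $appended .= $fragment;
}

echo $overwritten, "\n"; // <div>third</div>
echo $appended, "\n";    // <div>first</div><div>second</div><div>third</div>
```

Initializing the variable to an empty string before the loop also means it is defined even when nothing matches.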



Source: https://stackoverflow.com/questions/20475027/php-simple-html-dom-scrape-external-url
