I can't grab specific URLs from a search page

Submitted by 北城余情 on 2019-12-25 00:23:33

Question


I opened the real-estate website and searched by city name. Now I want to grab the URLs of the buildings in Osaka City. On this page, http://brillia.com/search/?area=27999, there are four of them.

I'm using that link with the following code to grab the URLs:

    $allDivs = $parser->getElementsByTagName('div');
    foreach ($allDivs as $div) {
        if ($div->getAttribute('class') == 'boxInfomation') {
            $allLinks = $div->getElementsByTagName('a');
            foreach ($allLinks as $a) {
                $linkler[] = $a->getAttribute('href');
            }
        }
    }

But I can't grab just those. Instead of only the Osaka City page URLs, I grabbed all of them. When I view the source of the Osaka page, it shows http://brillia.com/search/. That's why I'm grabbing all the other links as well.

So how can I grab only the URLs from this page: http://brillia.com/search/?area=27999

Any ideas? Thank you.


Answer 1:


Can you do this with jQuery? If so, this grabs the href of each link:

    $("div h3 a").each(function () {
        var link = $(this).attr("href");
        console.log(link);
    });

Here is a jsfiddle test.




Answer 2:


The parser relies on libxml to extract elements, but that page uses HTML5 heavily, omitting certain closing tags and so on, so it isn't strict XML. libxml struggles to "correct mistakes" by guessing where to close the missing tags, and that returns wrong results.

You need a parser with HTML5 support, such as HTML5DOMDocument, which extends DOMDocument and should have mostly the same interface.
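
Whichever parser ends up building the tree, the class filter from the question can be written as a single XPath query. The following is only a minimal sketch: it uses PHP's built-in DOMDocument on a small, well-formed inline sample (the markup and the href values are made up for illustration, not taken from the real page), and it matches boxInfomation as one of several possible classes rather than as the exact attribute value. Since HTML5DOMDocument extends DOMDocument, the same DOMXPath code would presumably work after swapping the class, but that is an assumption and has not been checked against the real site.

```php
<?php
// Hypothetical sample markup standing in for the fetched search page.
$html = <<<'HTML'
<!DOCTYPE html>
<html><body>
<div class="boxInfomation"><h3><a href="/osaka/one/">One</a></h3></div>
<div class="boxInfomation wide"><h3><a href="/osaka/two/">Two</a></h3></div>
<div class="other"><a href="/tokyo/three/">Three</a></div>
</body></html>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$parser = new DOMDocument();
$parser->loadHTML($html);
libxml_clear_errors();

// Match <a href> elements inside any div whose class list contains
// "boxInfomation", even when other classes are present on the same div.
$xpath = new DOMXPath($parser);
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' boxInfomation ')]//a[@href]";

$linkler = [];
foreach ($xpath->query($query) as $a) {
    $linkler[] = $a->getAttribute('href');
}

print_r($linkler);
```

With the sample markup above, only the two links inside the boxInfomation divs are collected. On the real page the result still depends on how well the parser recovers the HTML5 markup, which is the point of this answer.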



Source: https://stackoverflow.com/questions/52659703/i-cant-grab-specific-url-in-search-page
