Weird error using PHP Simple HTML DOM parser

梦谈多话 2020-11-29 10:30

I am using this library (PHP Simple HTML DOM Parser) to parse a link. Here's the code:

function getSemanticRelevantKeywords($keyword){
    $results = array(         


        
9 answers
  •  离开以前
    2020-11-29 11:07

    Before calling file_get_html() or load_file(), first check whether the URL exists.

    If the URL exists, you have passed the first step.
    (Some servers serve their 404 page as a valid HTML page. It has a proper HTML structure with head, body, etc., but its only content is text such as "This page couldn't be found. 404 error...")

    If the URL returns 200 OK, then check that what was fetched is an object and that its nodes are set.

    This is the code I use in my pages.

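    function url_exists($url){
        // Prepend a scheme when the URL does not start with one.
        if (strpos($url, "http") !== 0) $url = "http://" . $url;
        // get_headers() returns false on failure; @ suppresses the warning.
        $headers = @get_headers($url);
        if (!is_array($headers)) {
            return false;
        }
        // $headers[0] is the status line, e.g. "HTTP/1.1 404 Not Found".
        return strpos($headers[0], '404 Not Found') === false;
    }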
    
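    // Assumes simple_html_dom.php has already been included.
    $htmlPage = new simple_html_dom();
    $pageAddress = 'http://www.google.com';
    if ( url_exists($pageAddress) ) {
        $htmlPage->load_file( $pageAddress );
    } else {
        echo 'URL does not exist, stopping.';
        return;
    }
    
    // Make sure the fetched page parsed into a usable object.
    if( $htmlPage && is_object($htmlPage) && isset($htmlPage->nodes) )
    {
        // do your work here...
    } else {
        echo 'Fetched page is not OK, stopping.';
        return;
    }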
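
The URL check in this answer only rejects responses whose first status line contains the literal text "404 Not Found". Other failures (403, 500, or a status line with no reason phrase) would still pass. A possible refinement, offered as a sketch only, is to parse the numeric status code and accept only 2xx responses. The helper names below (status_code_from_line, final_status_code, is_success_status) are hypothetical and not part of Simple HTML DOM. The sketch assumes the header list has the shape returned by get_headers() without the associative flag.

```php
<?php
// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.1 404 Not Found". Returns null for non-status lines.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#i', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: when redirects are followed, the header list holds
// one status line per hop, so the last status line is the final response.
function final_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        if (!is_string($line)) {
            continue;
        }
        $parsed = status_code_from_line($line);
        if ($parsed !== null) {
            $code = $parsed;
        }
    }
    return $code;
}

// Hypothetical helper: treat only 2xx codes as "the page is there".
function is_success_status(?int $code): bool
{
    return $code !== null && $code >= 200 && $code < 300;
}
```

Inside url_exists, the final return could then become `return is_success_status(final_status_code($headers));`. This still cannot detect servers that answer 200 OK with an error page, so the object and nodes check after loading remains necessary.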
    
