file_get_contents script works with some websites but not others

轻奢々 2021-01-01 04:24

I'm looking to build a PHP script that parses HTML for particular tags. I've been using this code block, adapted from this tutorial:



        
3 answers
  •  感动是毒
    2021-01-01 05:16

    Another option: some hosts disable CURLOPT_FOLLOWLOCATION, so following redirects recursively is what you want. The function below also logs any errors to a text file. After it is a simple example of how to use DOMDocument() to extract the content; obviously it's not extensive, but it's something you could build upon.

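    <?php
    // NOTE: the opening of this function was lost when the page was scraped.
    // Everything above the foreach loop is a reconstruction from the description
    // (cURL, recursive redirect handling, error logging). The cURL options, the
    // $depth parameter and the $status variable name are assumptions, not the
    // answerer's original code.
    function file_get_site($url, $depth = 0) {
        $curl = curl_init($url);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
        curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0'); // some sites refuse requests with no User-Agent
        curl_setopt($curl, CURLOPT_TIMEOUT, 15);
        $html = curl_exec($curl);
        $status = curl_getinfo($curl);

        // Follow redirects by recursing, since CURLOPT_FOLLOWLOCATION may be disabled.
        if (in_array($status['http_code'], array(301, 302, 303, 307), true)
                && !empty($status['redirect_url']) && $depth < 5) {
            return file_get_site($status['redirect_url'], $depth + 1);
        }

        if ($html === false || $status['http_code'] != 200) {
            // Log every scalar field of the cURL status array, then the URL.
            $oline = '';
            foreach ($status as $key => $eline) {
                if (is_scalar($eline)) { $oline .= '[' . $key . ']' . $eline . ' '; }
            }
            $line = $oline . " \r\n " . $url . "\r\n-----------------\r\n";
            $handle = @fopen('./curl.error.log', 'a');
            if ($handle) {
                fwrite($handle, $line);
                fclose($handle);
            }
            return FALSE;
        }
        return $html;
    }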
    
    
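    function get_content_tags($source, $tag, $id = null, $value = null) {
        // file_get_site() returns FALSE on failure, and loadHTML() rejects empty input.
        if (!is_string($source) || $source === '') {
            return null;
        }
        $xml = new DOMDocument();
        // Suppress the warnings that malformed real-world HTML triggers.
        @$xml->loadHTML($source);

        foreach ($xml->getElementsByTagName($tag) as $tags) {
            if ($id === null) {
                // No attribute filter: return the text of the first matching tag.
                return $tags->nodeValue;
            }
            if ($tags->getAttribute($id) == $value) {
                // Attribute filter matched: return the content attribute (e.g. for <meta>).
                return $tags->getAttribute('content');
            }
            // Otherwise keep looking instead of returning the first non-matching tag.
        }
        return null;
    }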
    
    
    $source = file_get_site('http://www.freshdirect.com/about/index.jsp');
    
    echo get_content_tags($source,'title'); //FreshDirect
    
    echo get_content_tags($source,'meta','name','description'); //Online grocer providing high quality fresh......
    
    ?>
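
One possible way to build on the DOMDocument example is to query the parsed page with DOMXPath, so a single expression can select either a tag's text or an attribute. This is only a minimal sketch: the helper name `first_match` and the sample HTML are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the string value
// of the first node matching an XPath expression, or null if nothing matches.
function first_match(string $html, string $xpath): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects empty input on PHP 8
    }
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpath);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

// Made-up sample page, so the sketch runs without any network access.
$sample = '<html><head><title>Example Shop</title>'
        . '<meta name="keywords" content="a, b">'
        . '<meta name="description" content="An example description.">'
        . '</head><body><h1>Hello</h1></body></html>';

echo first_match($sample, '//title'), "\n";
echo first_match($sample, '//meta[@name="description"]/@content'), "\n";
```

In real use, the HTML string would be whatever the fetch step returned; checking for a failed fetch before parsing avoids passing a non-string to the parser.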
    
