file_get_contents script works with some websites but not others

轻奢々 2021-01-01 04:24
I'm looking to build a PHP script that parses HTML for particular tags. I've been using this code block, adapted from this tutorial:

      
      
        
3 answers

感动是毒 (OP) 2021-01-01 05:16
              

            
            
                        
Another option: some hosts disable CURLOPT_FOLLOWLOCATION, so following redirects recursively is what you want. The function also logs any errors to a text file. After it is a simple example of how to use DOMDocument to extract the content; obviously it's not extensive, but it's something you could build upon.

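<?php
// NOTE: the opening of file_get_site() did not survive in the source. Everything
// above the foreach loop is a reconstruction (an assumption) based on the
// description above, not the answerer's original code.
function file_get_site($url){
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
    curl_setopt($curl, CURLOPT_HEADER, true);         // keep headers so Location: can be read
    curl_setopt($curl, CURLOPT_TIMEOUT, 60);
    $response = curl_exec($curl);
    $status = curl_getinfo($curl);
    curl_close($curl);
    if($response === false){ $response = ''; }
    $headers = substr($response, 0, $status['header_size']);
    $html = substr($response, $status['header_size']);

    if($status['http_code'] != 200){
        // Follow redirects by hand, since CURLOPT_FOLLOWLOCATION may be disabled.
        // Assumes an absolute Location URL and no redirect loop.
        if(($status['http_code'] == 301 || $status['http_code'] == 302)
            && preg_match('/^Location:\s*(\S+)/mi', $headers, $matches)){
            return file_get_site($matches[1]);
        }
        // Anything else is logged as an error, one [key]value pair per curl_getinfo() entry
        $oline = '';
        foreach($status as $key => $eline){$oline .= '['.$key.']'.(is_scalar($eline) ? $eline : json_encode($eline)).' ';}
        $line = $oline." \r\n ".$url."\r\n-----------------\r\n";
        $handle = @fopen('./curl.error.log', 'a');
        if($handle){ fwrite($handle, $line); fclose($handle); } // skip logging if the file cannot be opened
        return FALSE;
    }
    return $html;
}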


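function get_content_tags($source,$tag,$id=null,$value=null){
    if(!$source){ return null; } // nothing to parse, e.g. file_get_site() returned FALSE

    $xml = new DOMDocument();
    @$xml->loadHTML($source); // @ silences warnings from malformed real-world HTML

    foreach($xml->getElementsByTagName($tag) as $tags) {
        if($id!==null){
            // An attribute filter was given: return the content of the first matching tag
            if($tags->getAttribute($id)==$value){
                return $tags->getAttribute('content');
            }
            continue; // keep looking instead of returning the first tag's text
        }
        return $tags->nodeValue;
    }
    return null; // no matching tag found
}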


$source = file_get_site('http://www.freshdirect.com/about/index.jsp');

echo get_content_tags($source,'title'); //FreshDirect

echo get_content_tags($source,'meta','name','description'); //Online grocer providing high quality fresh......

?>
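As one way to build on the DOMDocument example, the sketch below looks up a `<meta>` tag with DOMXPath instead of looping over every tag by hand. The helper name `find_meta_content` and the sample HTML are hypothetical, not part of the original answer, and the case-insensitive match on `name` is an assumption about what is usually wanted. It runs on an inline string, so it can be tried without any network access.

```php
<?php
// Hypothetical helper (not from the original answer): return the content
// attribute of the first <meta> whose name matches $name, or null if none does.
function find_meta_content(string $html, string $name): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them for malformed HTML
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Select only <meta> elements that carry both attributes, then compare the
    // name in PHP so that $name never has to be spliced into the XPath query.
    foreach ($xpath->query('//meta[@name and @content]') as $meta) {
        if (strcasecmp($meta->getAttribute('name'), $name) === 0) {
            return $meta->getAttribute('content');
        }
    }
    return null;
}

// Sample document (an assumption for illustration only)
$sample = '<html><head><title>Example</title>'
        . '<meta name="keywords" content="a, b">'
        . '<meta name="Description" content="A sample page.">'
        . '</head><body></body></html>';

echo find_meta_content($sample, 'description'), "\n";
var_dump(find_meta_content($sample, 'author'));
```

Comparing the attribute in PHP rather than inside the XPath expression keeps the query fixed, so a `$name` containing quotes cannot break it.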

    
             
                                                        
            
            
              
                