How can I parse background images and other images of a web page with PHP (Simple HTML DOM Parser)?

佛祖请我去吃肉 2020-12-18 15:46

How should I parse the background images and other images of a web page with PHP (Simple HTML DOM, etc.)?

Case 1: inline CSS



        
2 answers
  • 2020-12-18 16:05

    For Case 1:

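    // Assumes the Simple HTML DOM library (simple_html_dom.php) is already included.
    // Create DOM from URL or file
    $html = file_get_html('http://www.google.com/');

    // Get the style attribute of the element (getElementById() returns null if nothing matches)
    $element = $html->getElementById("id100");
    $style = $element ? $element->getAttribute('style') : '';

    // $style now holds a string such as: background:url(/mycar1.jpg)
    // Run it through a CSS parser, or use a regular expression, to pull out the values you need.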
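
    As a rough illustration of the regular expression route, the sketch below pulls every url(...) target out of a style string. The helper name extract_css_urls is hypothetical (it is not part of Simple HTML DOM), and the pattern only covers simple quoted or unquoted url() values, not every form the CSS grammar allows.

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM:
// returns the target of every url(...) found in a CSS string.
function extract_css_urls(string $css): array
{
    // Matches url(x), url('x') and url("x"). The optional opening quote is
    // captured in group 1 and must be repeated before the closing parenthesis.
    $pattern = '/url\(\s*([\'"]?)(.*?)\1\s*\)/i';
    preg_match_all($pattern, $css, $matches);

    // Group 2 holds the URLs themselves.
    return $matches[2];
}

// The style value from the example above.
$style = 'background:url(/mycar1.jpg)';
print_r(extract_css_urls($style));
```

    The returned paths are exactly as written in the CSS, so relative ones still have to be resolved against the page URL before downloading.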
    

    For Case 2/3:

    // Create DOM from URL or file
    $html = file_get_html('http://www.google.com/');

    // Get the <style> elements
    $style = $html->find('head', 0)->find('style');

    // $style now contains an array of the <style> elements inside <head>.
    // Pass the innertext of each one to a CSS parser.
    // External stylesheets are referenced by <link rel="stylesheet" href="..."> elements
    // rather than by <style>, so find those as well, download each href and parse that file.
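
    The same idea can be sketched with PHP's built-in DOM extension instead of Simple HTML DOM. The markup in $page, the file names in it and the variable names are made up for illustration, and a regular expression stands in for a real CSS parser.

```php
<?php
// Sample markup standing in for a fetched page (an assumption for illustration).
$page = '<html><head>'
      . '<link rel="stylesheet" href="/css/site.css">'
      . '<style>#id100 { background: url("/mycar2.jpg"); }</style>'
      . '</head><body><div id="id100"></div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; collect warnings quietly
$doc->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Embedded CSS: scan the text of each <style> element for url(...) values.
$embeddedUrls = [];
foreach ($xpath->query('//style') as $styleNode) {
    preg_match_all('/url\(\s*([\'"]?)(.*?)\1\s*\)/i', $styleNode->textContent, $m);
    $embeddedUrls = array_merge($embeddedUrls, $m[2]);
}

// External CSS: <link rel="stylesheet"> elements point at files through href.
$stylesheetHrefs = [];
foreach ($xpath->query('//link[@rel="stylesheet"]') as $linkNode) {
    $stylesheetHrefs[] = $linkNode->getAttribute('href');
}

print_r($embeddedUrls);    // image URLs found in embedded CSS
print_r($stylesheetHrefs); // stylesheet files that still need to be downloaded and scanned
```

    Each file listed in $stylesheetHrefs would then be fetched and scanned the same way. URLs inside an external stylesheet are relative to that stylesheet's location, not to the page.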
    
  • 2020-12-18 16:23

    To extract the <img> elements from the page, you can try something like:

    $doc = new DOMDocument(); 
    $doc->loadHTML("<html><body>Foo<br><img src=\"bar.jpg\" title=\"Foo bar\" alt=\"alt\"></body></html>"); 
    $xml = simplexml_import_dom($doc);
    $images = $xml->xpath('//img'); 
    // Print the src, alt and title attributes of each image, one image per line
    foreach ($images as $img) {
        echo $img['src'] . ' ' . $img['alt'] . ' ' . $img['title'] . PHP_EOL;
    }
    

    See the PHP documentation for DOMDocument for more details.
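
    If both kinds of image are wanted in one pass, a DOMXPath query can collect the src of every <img> together with any url(...) found in inline style attributes. This is only a sketch: the markup in $page is invented, and the regular expression handles simple url() values only.

```php
<?php
// Sample markup (an assumption for illustration).
$page = '<html><body>'
      . '<div id="id100" style="background:url(/mycar1.jpg)"></div>'
      . '<img src="bar.jpg" alt="alt" title="Foo bar">'
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // tolerate imperfect HTML
$doc->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$imageUrls = [];

// Regular images: the src attribute of every <img>.
foreach ($xpath->query('//img[@src]') as $img) {
    $imageUrls[] = $img->getAttribute('src');
}

// Inline CSS: every element with a style attribute that references url(...).
foreach ($xpath->query('//*[@style]') as $node) {
    preg_match_all('/url\(\s*([\'"]?)(.*?)\1\s*\)/i', $node->getAttribute('style'), $m);
    $imageUrls = array_merge($imageUrls, $m[2]);
}

print_r($imageUrls);
```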
