PHP: Parse a string from HTML

借酒劲吻你 2021-01-03 10:13

I have opened an HTML file using

file_get_contents('http://www.example.com/file.html')

and want to parse the line that contains "ParseThis".

3 answers
  • 2021-01-03 10:58

    You can use DOM for this.

    // Load remote file, suppress parse errors
    libxml_use_internal_errors(TRUE);
    $dom = new DOMDocument;
    $dom->loadHTMLFile('http://www.example.com/file.html');
    libxml_clear_errors();
    
    // use XPath to find all h1 elements whose class attribute is exactly "header"
    $xp = new DOMXpath($dom);
    $nodes = $xp->query('//h1[@class="header"]');
    
    // output first item's content
    echo $nodes->item(0)->nodeValue;
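
If the heading's class is not known in advance, the same DOM and XPath approach can look for the marker text itself. Below is a minimal sketch, assuming the page resembles the inline `$html` sample (made up for illustration, so the example does not depend on the remote file) and that "ParseThis" appears as direct text of some element:

```php
<?php
// Inline sample markup standing in for the remote file (an assumption for illustration)
$html = '<html><body><h1 class="header">Title</h1><p>Intro</p><p>Please ParseThis value</p></body></html>';

libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($dom);
// Match elements that have a direct text node containing the marker
$nodes = $xp->query('//*[text()[contains(., "ParseThis")]]');

$matches = [];
foreach ($nodes as $node) {
    $matches[] = trim($node->textContent);
}

print_r($matches);
```

Testing direct text nodes keeps ancestors such as `body` out of the result, since they only contain the marker through their children.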
    

    Also see

    • Best methods to parse HTML
    • More examples by me with DOM.

    Marking this CW because I have answered this before, but I am too lazy to find the duplicate

  • 2021-01-03 11:05

    Since it is the first h1 tag, getting it should be fairly trivial:

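    // $html holds the page markup, fetched as in the question
    $html = file_get_contents('http://www.example.com/file.html');
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $h1 = $doc->getElementsByTagName('h1');
    echo $h1->item(0)->nodeValue;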
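
Note that `item(0)` returns null when the page has no h1 at all, so a guarded variant may be safer. This is only a sketch: `first_h1_text` is a hypothetical helper name, and the inline HTML strings are made-up samples rather than the real page.

```php
<?php
// Hypothetical helper: returns the text of the first <h1>, or null if there is none
function first_h1_text(string $html): ?string
{
    // Silence libxml warnings for this parse only, then restore the old setting
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $h1 = $doc->getElementsByTagName('h1')->item(0);
    return $h1 === null ? null : trim($h1->textContent);
}

$withHeading = first_h1_text('<html><body><h1 class="header">ParseThis</h1></body></html>');
$withoutHeading = first_h1_text('<html><body><p>No heading here</p></body></html>');

var_dump($withHeading, $withoutHeading);
```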
    

    http://php.net/manual/en/class.domdocument.php

  • 2021-01-03 11:08

    Use this function.

    <?php
    // Returns the text between the first $start and the next $end,
    // or "" if either delimiter is missing.
    function get_string_between($string, $start, $end)
    {
        $ini = strpos($string, $start);
        if ($ini === false) {
            return "";
        }
        $ini += strlen($start);
        $stop = strpos($string, $end, $ini);
        if ($stop === false) {
            return "";
        }
        return substr($string, $ini, $stop - $ini);
    }

    $data = file_get_contents('http://www.example.com/file.html');

    echo get_string_between($data, '<h1 class="header">', '</h1>');
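
If the goal is literally the line that contains "ParseThis", as the question puts it, a plain line scan may be enough. A rough sketch, assuming the marker sits on a single line of the source; `lines_containing` is a hypothetical helper and `$data` here is an inline sample instead of the downloaded file:

```php
<?php
// Hypothetical helper: returns every line of $text that contains $needle
function lines_containing(string $text, string $needle): array
{
    $lines = preg_split('/\R/', $text);   // \R matches any newline sequence
    $hits = array_filter($lines, fn($line) => strpos($line, $needle) !== false);
    return array_values(array_map('trim', $hits));
}

// Inline sample standing in for file_get_contents(...) output
$data = "<html>\n<body>\n  <h1 class=\"header\">ParseThis</h1>\n  <p>Other</p>\n</body>\n</html>";
$found = lines_containing($data, 'ParseThis');

print_r($found);
```

Calling `strip_tags()` on a matched line then leaves only its text. Like any string-based approach, this breaks if the markup is reformatted, which is why the DOM answers are usually the more robust choice.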
    