DOMDocument class unable to access DOMNode

长情又很酷 2020-12-20 08:57

I can't parse this URL: http://foldmunka.net

$ch = curl_init("http://foldmunka.net");

//curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RE         


        
3 answers
  •  旧巷少年郎
    2020-12-20 09:43

    Here is a solution using DOMDocument and DOMXPath. It is much shorter and runs much faster (~100 ms versus ~2300 ms) than the other solution, which uses Simple HTML DOM Parser.

    function makePlainText($source)
    {
        $dom = new DOMDocument();
        $dom->loadHtmlFile($source);
    
        // use this instead of loadHtmlFile() to load from string:
        //$dom->loadHtml('HelloHello this sitealt attrclick Some text.');
    
        $xpath = new DOMXPath($dom);
    
        $plain = '';
    
        foreach ($xpath->query('//text()|//a|//img') as $node)
        {
            if ($node->nodeName == '#cdata-section')
                continue;
    
            if ($node instanceof DOMElement)
            {
                if ($node->hasAttribute('alt'))
                    $plain .= $node->getAttribute('alt') . ' ';
                if ($node->hasAttribute('title'))
                    $plain .= $node->getAttribute('title') . ' ';
            }
            if ($node instanceof DOMText)
                $plain .= $node->textContent . ' ';
        }
    
        return $plain;
    }
    
    echo makePlainText('http://foldmunka.net');
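
A minimal sketch of the same XPath walk applied to an HTML string rather than a URL, which makes the approach easy to try without network access. The function name `plainTextFromHtml` and the `$sample` markup are hypothetical and not part of the answer above. The call to `libxml_use_internal_errors()` is an added assumption: real-world pages are often not valid HTML, and without it `loadHTML()` may emit warnings.

```php
<?php
// Hypothetical helper (not from the original answer): collects visible text
// plus the alt/title attributes of <a> and <img> elements from an HTML string.
function plainTextFromHtml(string $html): string
{
    $dom = new DOMDocument();

    // Assumption: buffer libxml parse warnings instead of printing them,
    // then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $parts = [];

    // Same node selection as the answer: all text nodes, links, and images.
    foreach ($xpath->query('//text()|//a|//img') as $node) {
        if ($node instanceof DOMElement) {
            foreach (['alt', 'title'] as $attr) {
                if ($node->hasAttribute($attr)) {
                    $parts[] = $node->getAttribute($attr);
                }
            }
        } elseif ($node instanceof DOMText && !($node instanceof DOMCdataSection)) {
            // Skip CDATA sections; keep only non-empty text.
            $text = trim($node->textContent);
            if ($text !== '') {
                $parts[] = $text;
            }
        }
    }

    return implode(' ', $parts);
}

// Sample markup invented for illustration.
$sample = '<p>Hello <a href="#" title="this site"><img src="x.png" alt="alt attr">click</a> Some text.</p>';
echo plainTextFromHtml($sample), PHP_EOL;
```

Unlike the answer's version, this variant trims each piece and joins the pieces with single spaces, so the result has no trailing space. To use it on a live page, the string could come from cURL or `file_get_contents()` instead of `$sample`.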
    
