Extracting specific data from a web page using PHP [duplicate]

Submitted by 我只是一个虾纸丫 on 2019-12-20 10:48:30

Question


Possible Duplicate:
HTML Scraping in Php

I would like to know whether there is any way, using PHP, to get a specific string of text from a webpage when that text is updated every now and then. I've searched "all over the internet" and have found nothing. I only saw that preg_match could do it, but I didn't understand how to use it.

Imagine that a webpage contains this:

<div name="changeable_text">**GET THIS TEXT**</div>

How can I do it using PHP, after having used file_get_contents to put the page in a variable?

Thanks in advance :)


Answer 1:


You can use DOMDocument, like this:

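$html = file_get_contents( $url);

// Collect libxml warnings internally; real-world pages are often not well-formed
libxml_use_internal_errors( true);
$doc = new DOMDocument;
$doc->loadHTML( $html);
$xpath = new DOMXPath( $doc);

// A name attribute on a <div>???
$node = $xpath->query( '//div[@name="changeable_text"]')->item( 0);

// item( 0) returns null when nothing matches, so check before reading from it
if ( $node !== null) {
    echo $node->textContent; // This will print **GET THIS TEXT**
}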
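
Since the question asks about preg_match specifically, here is a minimal sketch of that approach for comparison. It assumes the markup looks exactly like the sample in the question, with name as the first attribute and no nested <div> inside; the inline $html string is a stand-in for the result of file_get_contents. Regular expressions are fragile against real-world HTML, so treat this as an illustration rather than a recommendation over the DOM approach.

```php
<?php
// Sample markup standing in for the page fetched with file_get_contents( $url).
$html = '<html><body><div name="changeable_text">**GET THIS TEXT**</div></body></html>';

// Capture everything between <div name="changeable_text"> and the next </div>.
// The s modifier lets . match newlines; the lazy .*? stops at the first closing tag.
$pattern = '~<div\s+name="changeable_text"[^>]*>(.*?)</div>~s';

$text = null;
if (preg_match($pattern, $html, $matches) === 1) {
    // $matches[1] holds the contents of the first capture group.
    $text = trim($matches[1]);
}

echo $text ?? 'not found', PHP_EOL;
```

If the attribute order changes or the element contains another <div>, this pattern will miss or truncate the text, which is the main reason to prefer a parser.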



Answer 2:


You might want to have a look at the Simple HTML DOM Library.

There is a little tutorial here: http://www.developertutorials.com/tutorials/php/easy-screen-scraping-in-php-simple-html-dom-library-simplehtmldom-398/

It is a screen-scraping library that lets you feed HTML to it and then pick out parts of it with a jQuery-like selector syntax.




Answer 3:


You’re talking about data scraping: the act of extracting data from human-readable output. In your case, the data is whatever sits between the <div> tags. Use PHP’s DOM extension to get to the tag you want and extract the data. Search Google for a PHP DOM tutorial.
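
A minimal sketch of that idea using only the DOM extension, without XPath. The inline $html string is an assumed stand-in for the fetched page, and the extra <div> is there only to show that the loop filters on the name attribute.

```php
<?php
// Sample markup standing in for the page fetched with file_get_contents.
$html = '<html><body><div>other</div>'
      . '<div name="changeable_text">**GET THIS TEXT**</div></body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk every <div> and keep the first one whose name attribute matches.
$found = null;
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('name') === 'changeable_text') {
        $found = trim($div->textContent);
        break;
    }
}

echo $found ?? 'not found', PHP_EOL;
```

Looping over getElementsByTagName is more verbose than an XPath query, but it needs no query syntax and makes the "not found" case explicit.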




Answer 4:


// file_get_html() is provided by the Simple HTML DOM library, which must be included first
$elements = file_get_html('url will go here');

foreach ($elements->find('element') as $ele) {

    // traverse according to your preferences

}

// return or output


Source: https://stackoverflow.com/questions/11567632/extracting-specific-data-from-a-web-page-using-php
