How to scrape a JavaScript site using PHP and cURL [duplicate]

Submitted by 独自空忆成欢 on 2019-12-20 07:59:19

Question


Possible Duplicate:
How do I render javascript from another site, inside a PHP application?

This is the site: http://www.oferta.pl/strona_v2/gazeta_v2/ . It is built entirely with JavaScript. I want to scrape it using PHP and cURL; currently I parse pages with DOMXPath. The left menu contains categories that can be selected, but I see no 'form' there. How can I use cURL to submit that selection and scrape the resulting page?

So far I have only used file_get_contents(), and it does not return the whole page. How should I proceed?

N.B.: I found this example, http://www.html-form-guide.com/php-form/php-form-submit.html , which has a 'form', but the site I am asking about has none.


Answer 1:


You cannot scrape it easily. It is possible, but it is very hard.

  1. Simulate the HTTP requests with cURL. Check every request the page makes via AJAX and try to reproduce it.

  2. Simulate the JavaScript execution (this part is almost impossible). Some requests contain values that are generated by JavaScript, so you need to reproduce them in PHP. If the site implements a complicated algorithm in JS, you can invoke the V8 JavaScript engine.
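
Step 1 can be sketched roughly as follows. This is only a minimal illustration, not the site's actual interface: the endpoint URL, the parameter names, and the helper functions `buildAjaxOptions`, `fetchAjax` and `extractLinks` are all hypothetical. The real request has to be read from the browser's network tab while a category in the left menu is clicked. The sketch assumes the menu triggers an AJAX POST rather than a form submission, and that the response is an HTML fragment that DOMXPath can parse.

```php
<?php
// Assumption: the URL, parameters and headers below are placeholders.
// Copy the real ones from the browser's network tab.

/** Build cURL options that mimic a browser's AJAX POST. */
function buildAjaxOptions(string $url, array $params, string $referer): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($params),
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => '',     // enable in-memory cookie handling
        CURLOPT_HTTPHEADER     => [
            'X-Requested-With: XMLHttpRequest',
            'Referer: ' . $referer,
        ],
    ];
}

/** Send the simulated AJAX request and return the raw response body. */
function fetchAjax(string $url, array $params, string $referer): string
{
    $ch = curl_init();
    curl_setopt_array($ch, buildAjaxOptions($url, $params, $referer));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

/** Collect the href of every link in an HTML fragment using DOMXPath. */
function extractLinks(string $html): array
{
    libxml_use_internal_errors(true);    // tolerate malformed real-world HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}
```

A call might then look like `extractLinks(fetchAjax($endpoint, ['category' => '12'], 'http://www.oferta.pl/strona_v2/gazeta_v2/'))`, where `$endpoint` and the `category` parameter are again placeholders. If the captured request carries a token or signature computed in JavaScript, this approach stops working and step 2 becomes necessary.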



Source: https://stackoverflow.com/questions/9342719/how-to-scrape-a-javascript-site-using-php-curl
