Web crawler capable of interpreting JavaScript in Python for Windows

Submitted by 戏子无情 on 2019-12-20 10:55:46

Question


My ultimate goal is to build a web crawler capable of downloading all of the images on a webpage. My understanding from the reading I've done is that I need to embed a rendering/layout engine such as Gecko or WebKit.

Unfortunately, I'm running Windows, so PyWebkit is out, and short of learning C++ for Gecko or Java to use Rhino, I'm not sure where to turn.

Is there a reliable rendering engine with Python bindings that will work on Windows (64-bit, Windows 7)? Is there an easy way to execute JavaScript within a Python script on Windows?


Answer 1:


You don't need WebKit to do that. All you need is an engine to run JavaScript code, so take a look at Google V8 or Mozilla SpiderMonkey.

If you prefer Python for building your crawler, you may want to use PyV8, as it provides all the necessary bindings.
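
For the image-collecting part of the goal, whichever JavaScript engine ends up running the page's scripts, a minimal sketch using only the Python 3 standard library could look like the following. It assumes the page's HTML (after any script execution) is already available as a string. The names `ImageCollector` and `collect_image_urls` and the example URLs are hypothetical and not part of PyV8 or any other library mentioned here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Hypothetical helper: gathers the src of every <img> tag it sees."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing <img/> also lands here.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the URL of the page itself.
            self.image_urls.append(urljoin(self.base_url, src))


def collect_image_urls(html, base_url):
    """Return absolute image URLs found in an HTML string, in document order."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.image_urls
```

Each returned URL could then be fetched with `urllib.request.urlretrieve` or a similar call. Images that scripts insert into the page only show up if the HTML passed in was captured after those scripts ran.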



Source: https://stackoverflow.com/questions/4998566/web-crawler-capable-of-interpreting-javascript-in-python-for-windows
