How to use CrawlSpider from scrapy to click a link with javascript onclick?

Submitted on 2019-12-03 04:34:20
Orochi

The actual methodology will be as follows:

  1. Post your request to reach the page (as you are already doing)
  2. Extract the link to the next page from that response
  3. Simply request the next page if possible, or use FormRequest again if applicable

All of this has to be aligned with the server's response mechanism, e.g.:

  • You can try passing `dont_click=True` to `FormRequest.from_response`
  • Or you may need to handle the 302 redirect coming from the server (in which case you will have to indicate in the request's `meta` that redirect responses should also be passed to your callback)

Now, how to figure all this out: use a web debugger such as Fiddler, the Firefox plugin Firebug, or simply hit F12 in IE 9, and check that the requests a real user makes on the website match the way you are crawling the page.

I built a quick crawler that executes JS via Selenium. Feel free to copy/modify it: https://github.com/rickysahu/seleniumjscrawl
