State of HTML after onload JavaScript

回眸只為那壹抹淺笑 submitted on 2019-12-11 08:32:34

Question


Many web pages use onload JavaScript to manipulate their DOM. Is there a way to automate access to the state of the HTML after this JavaScript has run?

A tool like wget is not useful here because it only downloads the original source. Is there perhaps a way to use a web browser rendering engine?

Ideally I am after a solution that I can interface with from Python.

Thanks!


Answer 1:


The only good way I know to do such things is to automate a browser, for example via Selenium RC. If you have no way to tell when the page has finished running the relevant JavaScript, then, just like a real live user visiting that page, you'll have to wait a while, grab a snapshot, wait some more, grab another, and check that nothing changed between them to convince yourself that it's really finished.
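The wait-and-compare loop described above could look roughly like the sketch below. It is only an illustration: the function name `wait_for_stable_html` and its parameters are made up for this example, and the snapshot source is passed in as a plain callable so the loop itself does not depend on any particular browser automation library.

```python
import time
from typing import Callable


def wait_for_stable_html(get_html: Callable[[], str],
                         interval: float = 1.0,
                         stable_checks: int = 2,
                         timeout: float = 30.0,
                         sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_html() until it returns identical text several times in a row.

    get_html      -- hypothetical callable returning the current HTML snapshot
    interval      -- seconds to wait between snapshots
    stable_checks -- how many consecutive unchanged snapshots count as "done"
    timeout       -- give up after this many seconds
    """
    deadline = time.monotonic() + timeout
    previous = get_html()          # first snapshot
    unchanged = 0
    while unchanged < stable_checks:
        if time.monotonic() > deadline:
            raise TimeoutError("HTML was still changing when the timeout expired")
        sleep(interval)            # wait a while...
        current = get_html()       # ...then grab another snapshot
        if current == previous:
            unchanged += 1         # no change since the last snapshot
        else:
            unchanged = 0          # page is still being modified; start over
            previous = current
    return previous
```

If you drive the browser with Selenium WebDriver's Python bindings (an assumption here; the answer itself mentions Selenium RC), the snapshot callable could be `lambda: driver.page_source`, called after `driver.get(url)`. Note that a page which keeps updating itself, for example with a clock or a rotating banner, will never look stable, which is why the sketch has a timeout.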




Answer 2:


Please see these related questions on Stack Overflow:

  • screen-scraping
  • Screen Scraping from a web page with a lot of Javascript


Source: https://stackoverflow.com/questions/1436211/state-of-html-after-onload-javascript
