How to make mechanize wait for web-page 'full' load?

半阙折子戏 2021-02-20 07:47

I want to scrape a web page that loads its components dynamically. The page has an onload script, and I can see the complete page 3-5 seconds after typing the URL into my b

2 answers
  •  执笔经年
    2021-02-20 08:02

    Working with a JavaScript-heavy web page in mechanize is not easy, because mechanize does not execute JavaScript and so cannot wait for scripts to finish. Depending on the situation, though, there are ways to get what you want.

    • If the content is built from JSON requests, you can call those URLs directly, parse the responses, and assemble the content yourself.

    • If you need to submit forms, you can create the form fields and set their values within mechanize. Alternatively, write a function that encodes your POST or GET data (quoting special characters, etc.) and send it with the open method of mechanize.Browser.

    • If the page uses JavaScript-based security functions (such as a special encoding applied to form data before posting), you can use a JavaScript runtime such as Node.js to run those code blocks.
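
The first two options can be sketched roughly as below. This is only an illustration under assumptions: the endpoint URL, the field names, and the "items"/"title" JSON shape are hypothetical, and you would need to find the real request in your browser's network tab. The helper names (encode_form, parse_items, fetch_json_items) are made up for this example.

```python
import json
from urllib.parse import urlencode


def encode_form(fields):
    """Percent-encode form fields into a query string (quotes special characters)."""
    return urlencode(fields)


def parse_items(raw_json):
    """Pull titles out of a JSON payload. The 'items'/'title' shape is an assumption."""
    payload = json.loads(raw_json)
    return [item["title"] for item in payload.get("items", [])]


def fetch_json_items(endpoint_url, fields):
    """Request the JSON endpoint directly instead of waiting for the page's scripts."""
    import mechanize  # third-party; imported here so the helpers above work without it

    browser = mechanize.Browser()
    # GET request: the encoded fields are appended to the URL as a query string.
    response = browser.open(endpoint_url + "?" + encode_form(fields))
    return parse_items(response.read())
```

For example, encode_form({"q": "a b&c", "page": 2}) gives "q=a+b%26c&page=2". A call such as fetch_json_items("https://example.com/api/list", {"page": 1}) would then return the parsed titles, assuming the site really exposes such an endpoint. Passing the encoded data as the second argument to open should send a POST instead of a GET.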

    In practice, though, some of these options are hard to implement, so think twice before using mechanize for such projects.
