“Eager” Page Load Strategy workaround for Chromedriver Selenium in Python

南楼画角 提交于 2020-01-20 06:55:25

问题


I want to speed up the loading time for pages on selenium because I don't need anything more than the HTML (I am trying to scrape all the links using BeautifulSoup). Using PageLoadStrategy.NONE doesn't work to scrape all the links, and Chrome no longer supports PageLoadStrategy.EAGER. Does anyone know of a workaround to get PageLoadStrategy.EAGER in python?


回答1:


ChromeDriver is the standalone server which implements WebDriver's wire protocol for Chromium. Chrome and Chromium are still in the process of implementing and moving to the W3C standard. Currently ChromeDriver is available for Chrome on Android and Chrome on Desktop (Mac, Linux, Windows and ChromeOS).

As per the current WebDriver W3C Editor's Draft The following is the table of page load strategies that links the pageLoadStrategy capability keyword to a page loading strategy state, and shows which document readiness state that corresponds to it:

However, if you observe the current implementation of of ChromeDriver, the Chrome DevTools does takes into account the following document.readyStates:

  • document.readyState == 'complete'
  • document.readyState == 'interactive'

Here is a sample relevant log:

[1517231304.270][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=11) {
   "expression": "var isLoaded = document.readyState == 'complete' ||    document.readyState == 'interactive';if (isLoaded) {  var frame = document.createElement('iframe');  frame.name = 'chromedriver dummy frame'; ..."
}

As per WebDriver Status you will find the list of all WebDriver commands and their current support in ChromeDriver based on what is in the WebDriver Specification. Once the implementation are completed from all aspects PageLoadStrategy.EAGER is bound to be functionally present within Chrome Driver.




回答2:


You only use normal or none as the pageLoadStrategy in chromdriver. So either choose none and handle everything yourself or wait for the page load as it normally happens



来源:https://stackoverflow.com/questions/51087832/eager-page-load-strategy-workaround-for-chromedriver-selenium-in-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!