How to make Selenium not wait till full page load, which has a slow script?

隐身守侯 submitted on 2019-11-25 23:49:59

Question


Selenium's driver.get(url) waits until the page has fully loaded. But the page I am scraping tries to load a dead JS script, so my Python script waits for it and does nothing for a few minutes. This problem can occur on every page of the site.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# The page tries to load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element(By.NAME, 'ANCHO').send_keys("100")

How can I limit the wait time, block the AJAX load of a file, or is there another way?

Also, I test my script with webdriver.Chrome(), but will use PhantomJS(), or probably Firefox(). So if a method relies on changing browser settings, it must be universal.


Answer 1:


By default, Selenium loads a page/URL with pageLoadStrategy set to normal. To make Selenium not wait for the full page load, we can configure pageLoadStrategy. It supports 3 different values:

  1. normal (full page load)
  2. eager (interactive)
  3. none

Here is the code block to configure the pageLoadStrategy :

  • Firefox :

    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    
    caps = DesiredCapabilities().FIREFOX
    caps["pageLoadStrategy"] = "normal"  #  complete
    #caps["pageLoadStrategy"] = "eager"  #  interactive
    #caps["pageLoadStrategy"] = "none"
    driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')
    driver.get("http://google.com")
    
  • Chrome :

    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    
    caps = DesiredCapabilities().CHROME
    caps["pageLoadStrategy"] = "normal"  #  complete
    #caps["pageLoadStrategy"] = "eager"  #  interactive
    #caps["pageLoadStrategy"] = "none"
    driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')
    driver.get("http://google.com")
    

Note: the pageLoadStrategy values normal, eager and none are required by the WebDriver W3C Editor's Draft, but at the time of writing the eager value was still a WIP (Work In Progress) in the ChromeDriver implementation. You can find a detailed discussion in “Eager” Page Load Strategy workaround for Chromedriver Selenium in Python.
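The snippets above target the Selenium 3 constructor arguments. In recent Selenium 4 releases, the same capability is normally set on an options object through its page_load_strategy attribute. Here is a minimal sketch of that variant, assuming Selenium 4 is installed and the driver binaries can be located automatically. The helper names validate_strategy and make_driver are hypothetical and not part of Selenium.

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def validate_strategy(strategy):
    """Return the strategy unchanged, or raise ValueError if it is unknown."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}")
    return strategy


def make_driver(browser="chrome", strategy="none"):
    """Start a browser whose get() follows the given page load strategy."""
    # Imported here so validate_strategy can be used without Selenium installed.
    from selenium import webdriver

    if browser == "chrome":
        options = webdriver.ChromeOptions()
        options.page_load_strategy = validate_strategy(strategy)
        return webdriver.Chrome(options=options)
    if browser == "firefox":
        options = webdriver.FirefoxOptions()
        options.page_load_strategy = validate_strategy(strategy)
        return webdriver.Firefox(options=options)
    raise ValueError(f"unsupported browser: {browser!r}")
```

With "none", driver.get() returns without waiting for the page, so the ANCHO field from the question may not exist yet when the next line runs. It is safer to wait for that element explicitly before sending keys.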




Answer 2:


Selenium Webdriver provides two types of waits - implicit & explicit. An explicit wait makes WebDriver wait for a certain condition to occur before proceeding further with execution.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somedomain/url_that_delays_loading")
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myDynamicElement"))
    )
finally:
    driver.quit()

This waits up to 10 seconds for the element to be present. If it is found within that time it is returned; otherwise a TimeoutException is thrown.

So a solution might be to set a maximum wait time and, if the element is not found within that period, catch the exception, optionally log the event, and then proceed. The code sample has been taken from here
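The "catch the exception, log it, and proceed" idea could look roughly like this. It is only a sketch: the helper name find_or_none and the logger are hypothetical, while the Selenium calls are the same ones used in the sample above.

```python
import logging

log = logging.getLogger(__name__)


def find_or_none(driver, locator, timeout=10):
    """Return the element matching `locator`, or None after `timeout` seconds."""
    # Local imports keep the helper self-contained.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        return WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located(locator)
        )
    except TimeoutException:
        # The element never appeared: record it and let the caller carry on.
        log.warning("Gave up waiting for %r after %s s", locator, timeout)
        return None
```

For the page in the question, the call would be something like find_or_none(driver, (By.NAME, "ANCHO")), followed by send_keys("100") only when the result is not None.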



Source: https://stackoverflow.com/questions/44770796/how-to-make-selenium-not-wait-till-full-page-load-which-has-a-slow-script
