PhantomJS returning empty web page (python, Selenium)

感动是毒 2020-12-15 06:51

Trying to screen scrape a web site without having to launch an actual browser instance in a Python script (using Selenium). I can do this with Chrome or Firefox - I've trie

3 Answers
  •  失恋的感觉
    2020-12-15 07:18

    You need to wait for the page to load. Usually, it is done by using an Explicit Wait to wait for a key element to be present or visible on a page. For instance:

    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    
    # ...
    browser.get("https://www.whatever.com")
    
    wait = WebDriverWait(browser, 10)  # same driver object that called get()
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.content")))
    
    html_source = browser.page_source
    # ...
    

    Here, we'll wait up to 10 seconds for a div element with class="content" to become visible before getting the page source.
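
    To make the idea of an explicit wait more concrete, here is a rough sketch of the kind of polling loop it performs. It does not use Selenium at all: `wait_until` and `content_is_ready` are hypothetical names made up for illustration, and the "page" is simulated by a counter. Selenium's own WebDriverWait also deals with things like elements that do not exist yet, which this sketch leaves out.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper for illustration only; not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Stand-in for a page whose content only shows up on the third check.
checks = {"count": 0}


def content_is_ready():
    checks["count"] += 1
    if checks["count"] >= 3:
        return "<div class='content'>loaded</div>"
    return None


html = wait_until(content_is_ready, timeout=2.0, poll_interval=0.01)
print(html)
```

    Reading the page source right after get() corresponds to calling the condition only once, which is why the result can be empty when the content is filled in later.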


    Additionally, you may need to ignore SSL errors:

    browser = webdriver.PhantomJS(desired_capabilities=dcap, service_args=['--ignore-ssl-errors=true'])
    

    Though, I'm pretty sure this is related to the redirect issues in PhantomJS. There is an open ticket in the PhantomJS bug tracker:

    • PhantomJS does not follow some redirects
