PhantomJS returning empty web page (python, Selenium)

感动是毒 2020-12-15 06:51

Trying to screen scrape a web site without having to launch an actual browser instance in a python script (using Selenium). I can do this with Chrome or Firefox - I've trie

3 answers

失恋的感觉 (OP)

2020-12-15 07:18
You need to wait for the page to load. This is usually done with an Explicit Wait, which blocks until a key element is present or visible on the page. For instance:
```
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


# ...
browser.get("https://www.whatever.com")

# pass the same driver instance that loaded the page (named "browser" here)
wait = WebDriverWait(browser, 10)
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.content")))

html_source = browser.page_source
# ...
```
Here, we'll wait up to 10 seconds for a div element with class="content" to become visible before getting the page source.
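
To make the mechanism concrete, here is a minimal, Selenium-free sketch of what an explicit wait does: it polls a condition until the condition returns something truthy or a timeout expires. The `wait_until` helper and the simulated `content_is_ready` condition are hypothetical names for illustration only, not Selenium's API; `WebDriverWait.until` follows the same idea but raises Selenium's own `TimeoutException`.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper that mimics the idea behind an explicit wait;
    it is not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulate a page whose content only appears on the third check.
attempts = {"count": 0}


def content_is_ready():
    attempts["count"] += 1
    if attempts["count"] >= 3:
        return "<div class='content'>loaded</div>"
    return None


html = wait_until(content_is_ready, timeout=2.0, poll_interval=0.01)
print(html)
```

Reading `page_source` immediately after `get()` corresponds to checking the condition exactly once, which is why the page can come back empty.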

Additionally, you may need to ignore SSL errors:
```
# dcap is assumed to be the desired-capabilities dict already defined in your script
browser = webdriver.PhantomJS(desired_capabilities=dcap, service_args=['--ignore-ssl-errors=true'])
```
That said, I'm fairly sure this is related to the redirect-handling issues in PhantomJS. There is an open ticket in the PhantomJS bug tracker:
- PhantomJS does not follow some redirects
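
Whichever cause applies, it can help to make the script fail loudly instead of silently scraping nothing. Below is a small sketch, assuming the empty result looks like a bare `<html><head></head><body></body></html>` skeleton (an assumption about the symptom, since the actual output is not shown in the question); `looks_empty` is a hypothetical helper, not part of Selenium.

```python
import re


def looks_empty(page_source):
    """Return True if the <body> of the HTML contains nothing but whitespace.

    Hypothetical helper; falls back to the whole string if no <body> is found.
    """
    match = re.search(r"<body[^>]*>(.*?)</body>", page_source, re.S | re.I)
    inner = match.group(1) if match else page_source
    return inner.strip() == ""


blank = "<html><head></head><body></body></html>"
loaded = "<html><body><div class='content'>hi</div></body></html>"

print(looks_empty(blank))
print(looks_empty(loaded))
```

In a scraping script, a check like this on `browser.page_source` could be used to raise an error or retry before any parsing happens.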