Question
I'm working on some code that uses the Selenium Firefox webdriver. Most things seem to work, but when I switch the browser to PhantomJS, it starts to behave differently.
The page I'm processing has to be scrolled slowly so that it loads more and more results, and that is probably where the problem lies.
Here is the code, which works with the Firefox webdriver but doesn't work with PhantomJS:
    from time import sleep

    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def get_url(destination, start_date, end_date):  # dates are formatted as %Y-%m-%d
        return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS', destination, destination, 'CVIE%20BUD%20BTS', start_date, end_date)

    def load_whole_page(self, destination, start_date, end_date):
        deb()
        url = get_url(destination, start_date, end_date)
        self.driver.maximize_window()
        self.driver.get(url)
        wait = WebDriverWait(self.driver, 60)
        # wait until both loading indicators have disappeared
        wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
        wait.until(EC.invisibility_of_element_located((By.XPATH,
            u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
        i = 0
        old_driver_html = ''
        end = False
        while end == False:
            i += 1
            results = self.driver.find_elements_by_css_selector("div.flightbox")
            print len(results)
            if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
                break
            try:
                # scroll to the first and then to the last result to trigger loading
                self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
                self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
            except:
                # the original called str() with no argument; the loop counter was presumably intended
                self.driver.save_screenshot('screen_before_' + str(i) + '.png')
                sleep(2)
                print 'EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<'
                continue
            # stop when scrolling no longer changes the page source
            new_driver_html = self.driver.page_source
            if new_driver_html == old_driver_html:
                print 'END OF PAGE'
                break
            old_driver_html = new_driver_html
            wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
            sleep(10)
To detect when the page is fully loaded, I compare the previous copy of the HTML with the new one. That is probably not how it should be done, but with Firefox it is sufficient.
Here is a screenshot of PhantomJS at the point where loading stops:

With Firefox, the page loads more and more results, but with PhantomJS it gets stuck at, for example, 10 results.
Any ideas? What are the differences between these two drivers?
Answer 1:
Two key things that helped me to solve it:
- do not use that custom wait I've helped you with before
- set window.document.body.scrollTop first to 0 and then to document.body.scrollHeight, one right after the other
Working code:
    results = []
    while len(results) < 200:
        results = driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        # scroll
        driver.execute_script("arguments[0].scrollIntoView();", results[0])
        driver.execute_script("window.document.body.scrollTop = 0;")
        driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
        driver.execute_script("arguments[0].scrollIntoView();", results[-1])
Version 2 (endless loop, stop if there is nothing loaded on scroll anymore):
    results = []
    while True:
        try:
            wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
        except TimeoutException:
            break
        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        # scroll
        for _ in xrange(5):
            try:
                self.driver.execute_script("""
                    arguments[0].scrollIntoView();
                    window.document.body.scrollTop = 0;
                    window.document.body.scrollTop = document.body.scrollHeight;
                    arguments[1].scrollIntoView();
                """, results[0], results[-1])
            except StaleElementReferenceException:
                break  # here it means more results were loaded
    print "DONE. Result count: %d" % len(results)
Note that I've changed the comparison in the wait_for_more_than_n_elements expected condition. Replaced:
    return count >= self.count
with:
    return count > self.count
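The wait_for_more_than_n_elements condition itself is not shown in this answer. As a rough sketch only, it might look like the class below. This assumes it is a callable class that receives the driver (which is how WebDriverWait.until invokes a condition) and that the locator is a (by, value) tuple passed on to driver.find_elements. The body is a guess at the shape, not the original code.

```python
class wait_for_more_than_n_elements(object):
    """Hypothetical reconstruction of the custom expected condition.

    Truthy once the locator matches strictly more than `count` elements.
    """

    def __init__(self, locator, count):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, "div.flightbox")
        self.count = count      # number of elements already seen

    def __call__(self, driver):
        # WebDriverWait.until calls this repeatedly with the driver until
        # the return value is truthy or the timeout expires.
        count = len(driver.find_elements(*self.locator))
        # Strictly greater: with >= the wait would succeed immediately,
        # even when no new results had been loaded.
        return count > self.count
```

With the strict comparison, the wait in Version 2 only returns when new results have appeared, so a TimeoutException can be used as the signal that the page has nothing more to load.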
Version 3 (scrolling from header to footer multiple times):
    header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
    footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))
    results = []
    while True:
        try:
            wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
        except TimeoutException:
            break
        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        # scroll
        for _ in xrange(5):
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                arguments[1].scrollIntoView();
            """, header, footer)
            sleep(1)
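All of the versions here rely on the same stopping idea: keep scrolling while the number of div.flightbox elements grows, and stop once it no longer does. That idea can be sketched independently of the driver. The function name, its parameters, and the idle-round limit below are illustrative assumptions and not part of the code in this thread.

```python
import time


def scroll_until_no_growth(count_items, scroll_once, max_idle_rounds=3,
                           pause=1.0, sleep=time.sleep):
    """Scroll until the item count stops growing (hypothetical helper).

    count_items: callable returning the current number of loaded items.
    scroll_once: callable performing one scroll step.
    Returns the final item count.
    """
    last_count = count_items()
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        scroll_once()
        sleep(pause)  # give the page time to load more results
        current = count_items()
        if current > last_count:
            # new results appeared, so reset the idle counter
            last_count = current
            idle_rounds = 0
        else:
            idle_rounds += 1
    return last_count
```

With Selenium, count_items could wrap the find_elements_by_css_selector("div.flightbox") call and scroll_once could wrap the execute_script call from Version 3. Taking both as callables keeps the loop testable without a browser.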
Source: https://stackoverflow.com/questions/31371460/phantomjs-acts-differently-than-firefox-webdriver