Question
On a personal project I'm working on, I need to scrape the names of items from a dynamic site using Selenium.
To get all the data, you need to scroll to the bottom of the page.
However, it gets trickier: if you scroll to the bottom too quickly, you only get the names of the items at the bottom. No matter how long you wait, you still only get the items currently in view.
So I figured I could scroll to the bottom slowly, but that doesn't seem to work.
Here is my demo code to illustrate the problem:
url='https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)
#driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") # this works but you get items at bottom only
#this scrolls slowly to end
driver.execute_script("function pageScroll() {window.scrollBy(0,50);scrolldelay = setTimeout('pageScroll()',1000);}pageScroll()")
time.sleep(2)
products = driver.find_elements_by_class_name("grid-product__content")
for product in products:
    name = product.find_element_by_class_name("grid-product__title").text
    print(name)
Any ideas?
Extra (imports and configs)
import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(chrome_options=chrome_options)
Answer 1:
Solution
page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
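A single END press may not be enough if the page keeps loading more items as it scrolls. The following is a minimal sketch, not part of the original answer, of repeating the key press until the number of items stops growing. The helper name `press_until_stable` and its parameters are hypothetical; it takes two callables so it does not depend on any particular locator.

```python
import time


def press_until_stable(press_end, count_items, pause=1.0, max_rounds=30):
    """Call press_end() repeatedly until count_items() stops changing.

    press_end   -- callable that scrolls the page (e.g. sends the END key)
    count_items -- callable that returns how many items are currently found
    Returns the last item count that was seen.
    """
    seen = -1
    for _ in range(max_rounds):
        press_end()
        time.sleep(pause)  # give lazily loaded items time to render
        count = count_items()
        if count == seen:  # nothing new appeared, assume the end was reached
            break
        seen = count
    return seen
```

With the `page` element above, it could be called as `press_until_stable(lambda: page.send_keys(Keys.END), lambda: len(driver.find_elements(By.CLASS_NAME, "grid-product__content")))`. The pause and the round limit are guesses that may need tuning for the site.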
How it's applied in this situation
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)
url = 'https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)
WebDriverWait(driver, 10).until(ec.element_to_be_clickable((By.XPATH, "//button[text()='No thanks']"))).click()
# Send END to the page so the browser scrolls to the bottom
page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
products = driver.find_elements(By.CLASS_NAME, "grid-product__content")
for product in products:
    name = product.find_element(By.CLASS_NAME, "grid-product__title").text
    print(name)
    # Press END again to keep the page scrolled to the bottom
    page.send_keys(Keys.END)
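As a side note on the question's `setTimeout` approach: that script moves the page 50 pixels once per second inside the browser, while Python only waits two seconds before reading the elements, so the page has barely moved by then. A hedged alternative, not from the original answer, is to drive the scrolling from Python so that each step finishes before the next begins. The helper name `scroll_in_steps` and its default values are assumptions for illustration.

```python
import time


def scroll_in_steps(driver, step=500, pause=0.5, max_steps=200):
    """Scroll down `step` pixels at a time until the bottom is reached.

    `driver` is expected to be a Selenium WebDriver (anything with an
    execute_script method). Returns the number of steps taken.
    """
    for taken in range(1, max_steps + 1):
        # Scroll by a fixed amount; the step size is passed in as arguments[0]
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        time.sleep(pause)  # let lazily loaded items render before moving on
        # Re-check against the current page height, which may have grown
        at_bottom = driver.execute_script(
            "return window.innerHeight + window.pageYOffset"
            " >= document.documentElement.scrollHeight - 1;"
        )
        if at_bottom:
            return taken
    return max_steps
```

Calling `scroll_in_steps(driver)` after `driver.get(url)` and before collecting the products would scroll the whole page in small steps. Whether this makes every item render depends on how the site loads its content.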
Source: https://stackoverflow.com/questions/61563931/how-to-scroll-to-the-end-of-a-page-slowly-using-selenium-so-that-i-can-get-dynam