How to scroll to the end of a page slowly using Selenium so that I can get dynamically loaded content

Submitted by 心已入冬 on 2021-01-29 08:45:07

Question


For a personal project, I need to scrape the names of items from a dynamic site using Selenium.

To get all the data, you need to scroll to the bottom of the page.

However, there is a catch: if you scroll to the bottom too quickly, you only get the names of the items at the bottom. No matter how long you wait, you still only get the items currently in view.

So I figured I could scroll to the bottom slowly, but that doesn't seem to work.

Here is my demo code to illustrate the problem:

url = 'https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)

# This works, but you only get the items at the bottom:
# driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
# This is meant to scroll slowly to the end:
driver.execute_script("function pageScroll() {window.scrollBy(0,50);scrolldelay = setTimeout('pageScroll()',1000);}pageScroll()")
time.sleep(2)
# find_elements(By..., ...) replaces the find_elements_by_* helpers, which were removed in Selenium 4
products = driver.find_elements(By.CLASS_NAME, "grid-product__content")
for product in products:
    name = product.find_element(By.CLASS_NAME, "grid-product__title").text
    print(name)

Any ideas?

Extra (imports and configs)

import time
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)  # the chrome_options= keyword is deprecated

Answer 1:


Solution

page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
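
A single END key press only reaches whatever the bottom of the page is at that moment. If the site keeps appending items as you get close to the bottom, one option is to repeat the jump until the page height stops growing. This is a minimal sketch, not part of the original answer: `scroll_until_height_stable`, `pause` and `max_rounds` are hypothetical names, and it assumes `document.body.scrollHeight` reflects the full page height on the target site.

```python
import time


def scroll_until_height_stable(driver, pause=1.0, max_rounds=30):
    """Jump to the bottom repeatedly until the page height stops changing.

    `driver` is any object with Selenium's execute_script(script) method.
    Returns the number of jumps that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Jump to the current bottom of the page.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give lazy-loaded content time to arrive (tune pause for the site).
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was appended, so assume the real end was reached.
            return rounds
        last_height = new_height
    # Safety cap, so an endless feed cannot loop forever.
    return max_rounds
```

Call `scroll_until_height_stable(driver)` after `driver.get(url)` and before looking up the products. A fixed `pause` is a guess; a slow connection may need a longer one.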

How it's applied in this situation:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--window-size=1420,1080')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(options=chrome_options)  # the chrome_options= keyword is deprecated

url = 'https://shopzetu.com/search?type=product,article,page&q=dress'
driver.get(url)

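# Wait for the pop-up and dismiss it by clicking its "No thanks" button
WebDriverWait(driver, 10).until(ec.element_to_be_clickable((By.XPATH, "//button[text()='No thanks']"))).click()
# Press END on the page to jump to the bottom
page = driver.find_element(By.TAG_NAME, "html")
page.send_keys(Keys.END)
# find_element(s)(By..., ...) replaces the find_element(s)_by_* helpers, which were removed in Selenium 4
products = driver.find_elements(By.CLASS_NAME, "grid-product__content")
for product in products:
    name = product.find_element(By.CLASS_NAME, "grid-product__title").text
    print(name)
page.send_keys(Keys.END)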
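
The question mentions that jumping straight to the bottom only yields the last items. If that happens because the site renders only the items near the viewport (an assumption, not verified for this site), a possible workaround is to drive the scrolling from Python in small steps and collect names at every step. The sketch below is not from the original answer: `collect_while_scrolling`, `collect`, `step`, `pause` and `max_steps` are hypothetical names, and `collect` is any function you supply that returns the names currently in the DOM.

```python
import time

# True once the bottom edge of the viewport has reached the end of the page.
AT_BOTTOM_JS = (
    "return window.innerHeight + window.pageYOffset"
    " >= document.body.scrollHeight - 1;"
)


def collect_while_scrolling(driver, collect, step=500, pause=0.5, max_steps=200):
    """Scroll down `step` pixels at a time, gathering names after each step.

    `driver` needs Selenium's execute_script(script, *args) method.
    `collect(driver)` must return an iterable of strings.
    Returns the unique, non-empty names in the order they were first seen.
    """
    seen = {}  # dict keeps insertion order, so it acts as an ordered set

    def gather():
        for name in collect(driver):
            if name:
                # Note: distinct products that share a name are merged here.
                seen.setdefault(name, None)

    for _ in range(max_steps):
        gather()
        driver.execute_script("window.scrollBy(0, arguments[0]);", step)
        # Let the page load or render the newly exposed items.
        time.sleep(pause)
        if driver.execute_script(AT_BOTTOM_JS):
            break
    gather()  # pick up whatever is visible at the final position
    return list(seen)
```

With the page from the question, a call could look like `collect_while_scrolling(driver, lambda d: [e.text for e in d.find_elements(By.CLASS_NAME, "grid-product__title")])`. The bottom check runs after the pause, so content appended during the wait extends the page and the loop continues.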


Source: https://stackoverflow.com/questions/61563931/how-to-scroll-to-the-end-of-a-page-slowly-using-selenium-so-that-i-can-get-dynam
