Python & Selenium: Iterate through list of WebElements Error: StaleElementReferenceException

一笑奈何 submitted on 2021-02-11 18:22:29

Question


Good afternoon,

I'm somewhat new to Python and web scraping, so any help would be greatly appreciated! First:

The Code

from selenium import webdriver
import time 

chrome_path = r"/Users/ENTER/Desktop/chromedriver"

driver = webdriver.Chrome(chrome_path)

site_url = 'https://www.home-school.com/groups/'

driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

# visit each state link and save its group data
for links in items:
    # print(links.text)
    print(links.get_attribute("href"))
    # add link related logic here
    links.click()
    # you have to wait for the next element to display
    time.sleep(4)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[4]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they 
    #  are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + links.text + '.txt'
    file_object = open(filepath, 'a')
    file_object.write(orgdata)

The Problem

I am using Selenium in an attempt to save the names and information of homeschool groups from http://home-school.com/groups/ to individual text files per state.

To do this, I have saved a list of links and would like to iterate through it: click each link, scrape the desired data, clean up the text, and write it to a separate text file per state.

I am getting StaleElementReferenceException: stale element reference: element is not attached to the page document when the "for" loop runs.

I believe it is giving the error when it gets to element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div"""). As far as I can tell, this xpath does not change. I assumed I needed to make the webdriver wait for the page to load, hence time.sleep(4).

I'm sure this is a simple fix that will make sense when I see it, but at the moment I am stumped. Any help you all can offer would be awesome! Thank you!


Answer 1:


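Try this. Each WebElement in items belongs to the page that was loaded when you found it. links.click() navigates to a new page, so any later use of those elements (links.text after the click, or the next link in the loop) raises StaleElementReferenceException because they are no longer attached to the document. Read the href and text of every link into plain Python lists before navigating, then load each URL with driver.get():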

from selenium import webdriver
import time 

chrome_path = r"/Users/ENTER/Desktop/chromedriver"

driver = webdriver.Chrome(chrome_path)

site_url = 'https://www.home-school.com/groups/'

driver.get(site_url)

# get state links from sidebar and store to list
area = driver.find_element_by_xpath("/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div")
items = area.find_elements_by_tag_name('a')

# remove unneeded links
del items[:22]
del items[-1:]

text_list = [i.text for i in items]
items = [i.get_attribute("href") for i in items]

for i in range(len(items)):
    driver.get(items[i])
    # you have to wait for the next element to display
    time.sleep(2)
    # assign html container with desired data to variable
    element = driver.find_element_by_xpath("""/html/body/center/table/tbody/tr/td/table[3]/tbody/tr/td[2]/div""")
    # Store container text in variable. We skip the first 5 lines of text as they 
    #  are unnecessary.
    orgdata = element.text.split("\n",5)[5]
    orgdata = orgdata.replace(' Edit Remove More', '').replace(' Edit Remove', '')
    # Write data to text file
    filepath = '/Users/ENTER/Documents/STEMBoard/Tiger Team/Lingo/' + text_list[i] + '.txt'
    # the with-block closes the file even if write() raises
    with open(filepath, 'a') as file_object:
        file_object.write(orgdata)
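
The text cleanup and file writing inside the loop can also be pulled into small helpers that can be checked without a browser. This is only a minimal sketch, assuming the container text has the shape described in the question (five header lines, then the data). The names clean_group_text and save_state_text are hypothetical and not part of Selenium. Unlike the code above, this version returns an empty string instead of raising IndexError when fewer than six lines come back:

```python
from pathlib import Path


def clean_group_text(raw_text, header_lines=5):
    """Drop the leading header lines and the 'Edit Remove' link residue."""
    # Same split as in the answer: at most `header_lines` splits, so the
    # last piece holds everything after the header.
    parts = raw_text.split("\n", header_lines)
    if len(parts) <= header_lines:
        # Fewer lines than expected (for example, the page has not loaded yet).
        return ""
    body = parts[header_lines]
    # Replace the longer marker first so ' More' is not left behind.
    return body.replace(" Edit Remove More", "").replace(" Edit Remove", "")


def save_state_text(out_dir, state_name, text):
    """Append text to <out_dir>/<state_name>.txt and return the path."""
    path = Path(out_dir) / (state_name + ".txt")
    # The with-block closes the file even if write() raises.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(text)
    return path
```

Inside the loop, the last lines would then become save_state_text(out_dir, text_list[i], clean_group_text(element.text)), where out_dir is whatever output folder you use.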


Source: https://stackoverflow.com/questions/62181398/python-selenium-iterate-through-list-of-webelements-error-staleelementrefere
