For loop keeps iterating over first piece of data in list

Posted by 限于喜欢 on 2020-04-30 06:32:20

Question


This code scrapes the HTML table at https://www.asx.com.au/asx/statistics/prevBusDayAnns.do and downloads the PDF files for specific ASX codes and headlines. When the for loop iterates over the ASX codes in 'data', it processes the first ASX code five times, which creates five duplicates of the same PDF. With the code below, for example, I end up with five copies of the TWD PDF. The number of times the first ASX code is repeated equals the number of ASX codes in 'data': with ten codes, I would get ten copies of the TWD PDF. This only happens with the first ASX code; every other code is downloaded once. Why is this happening?

Relevant code:

driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1','TKF','GDF']
asxcodes = []
for d in data:
    try:
        asxcode = driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d,"Becoming a substantial holder")).get_attribute("href")
        asxcodes.append(asxcode)
    except:
        pass
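
The locator in the loop is assembled with str.format: the first placeholder receives the ASX code and the second receives the headline text. Below is a minimal sketch of that step pulled into a helper, so the resulting expression can be printed and checked by hand. The function name `announcement_xpath` is hypothetical and not part of the original script, and the sketch assumes neither value contains a single quote.

```python
def announcement_xpath(code, headline):
    """Build the XPath for the headline link in the row whose code cell equals `code`.

    Assumption: neither argument contains a single quote, since the values
    are placed inside single-quoted XPath string literals without escaping.
    """
    template = (
        "//table//tr//td[text()='{}']"
        "/following-sibling::td[3]/a[contains(.,'{}')]"
    )
    return template.format(code, headline)


# Same code and headline as in the question.
xpath = announcement_xpath("TWD", "Becoming a substantial holder")
print(xpath)
```

Printing the expression for each code makes it easy to paste into the browser's developer tools and see which rows it matches.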

Entire code:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.always_open_pdf_externally": True,"download.default_directory" : r"C:\Users\Harrison Pollock\Desktop\The Smarts\Becoming a Substantial Holder"}
chromeOptions.add_experimental_option("prefs",prefs)
chromedriver = r"C:\Users\Harrison Pollock\Downloads\Python\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chromedriver,chrome_options=chromeOptions)
driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1','TKF','GDF']
asxcodes = []
for d in data:
    try:
        asxcode = driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d,"Becoming a substantial holder")).get_attribute("href")
        asxcodes.append(asxcode)
    except:
        pass
for asxcode in asxcodes:
    driver.get(asxcode)
    WebDriverWait(driver, 15).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='Agree and proceed']"))).click()
    time.sleep(10)  
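
Before the download loop runs, it may help to check whether asxcodes already holds the same link more than once. That would show whether the repetition starts while the links are collected or only later, while they are downloaded. Below is a minimal sketch of such a check. The helper names `report_duplicates` and `unique_in_order` and the example.com URLs are hypothetical stand-ins, not values from the ASX page.

```python
from collections import Counter


def report_duplicates(links):
    """Return a dict mapping each link that appears more than once to its count."""
    counts = Counter(links)
    return {link: n for link, n in counts.items() if n > 1}


def unique_in_order(links):
    """Drop repeated links while keeping the order in which they were first seen."""
    return list(dict.fromkeys(links))


# Hypothetical sample standing in for the hrefs collected in asxcodes.
sample = [
    "https://example.com/a.pdf",
    "https://example.com/b.pdf",
    "https://example.com/a.pdf",
]
duplicates = report_duplicates(sample)
deduped = unique_in_order(sample)
print(duplicates)
print(deduped)
```

If `report_duplicates(asxcodes)` comes back empty, the list is fine and the repeated files are being produced during the download step instead.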

Answer 1:


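Instead of collecting every href first and then iterating over them, you could try clicking each link in turn and then clicking the download button on the page it opens. Each link opens in a new window, so the code below switches to that window, clicks "Agree and proceed", closes the window, and switches back before moving on to the next code.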

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import time
chromeOptions = webdriver.ChromeOptions()
prefs = {"plugins.always_open_pdf_externally": True,"download.default_directory" : r"C:\Users\Harrison Pollock\Desktop\The Smarts\Becoming a Substantial Holder"}
chromeOptions.add_experimental_option("prefs",prefs)
chromedriver = r"C:\Users\Harrison Pollock\Downloads\Python\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(executable_path=chromedriver,chrome_options=chromeOptions)
driver.get("https://www.asx.com.au/asx/statistics/prevBusDayAnns.do")
data = ['TWD', 'GEM', 'AT1','TKF','GDF']
asxcodes = []
for d in data:
    try:
        driver.find_element_by_xpath("//table//tr//td[text()='{}']/following-sibling::td[3]/a[contains(.,'{}')]".format(d,"Becoming a substantial holder")).click()
        WebDriverWait(driver,5).until(EC.number_of_windows_to_be(2))
        driver.switch_to.window(driver.window_handles[-1])
        WebDriverWait(driver,5).until(EC.element_to_be_clickable((By.XPATH, "//input[@value='Agree and proceed']"))).click()
        time.sleep(10)
        driver.close()
        driver.switch_to.window(driver.window_handles[-1])
    except:
        driver.switch_to.window(driver.window_handles[-1])
        continue
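
The fixed time.sleep(10) gives each download ten seconds whether it needs more or less. One possible alternative is to poll the download folder until a new, finished file shows up. Below is a minimal sketch, assuming Chrome marks in-progress downloads with a `.crdownload` suffix. The helper name `wait_for_new_download` and its parameters are hypothetical, and the demo uses a temporary folder with empty files instead of a real browser.

```python
import os
import tempfile
import time


def wait_for_new_download(folder, before, timeout=60.0, poll=0.5):
    """Wait until `folder` contains a finished file that was not in `before`.

    `before` is a set of file names captured before the download was started.
    Returns the set of new finished file names, or an empty set on timeout.
    Assumption: unfinished Chrome downloads end in ".crdownload".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        new_names = set(os.listdir(folder)) - before
        finished = {n for n in new_names if not n.endswith(".crdownload")}
        if finished:
            return finished
        time.sleep(poll)
    return set()


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    before = set(os.listdir(tmp))
    # A partial download alone should not count as finished.
    open(os.path.join(tmp, "notice.pdf.crdownload"), "w").close()
    still_waiting = wait_for_new_download(tmp, before, timeout=0.3, poll=0.1)
    # Once a finished file exists, it is reported.
    open(os.path.join(tmp, "notice.pdf"), "w").close()
    done = wait_for_new_download(tmp, before, timeout=0.3, poll=0.1)

print(still_waiting)
print(done)
```

In the loop above, the set of existing file names would be captured just before clicking "Agree and proceed", and the helper would be called where time.sleep(10) currently sits, with `folder` set to the same path as download.default_directory.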

Hope this logic helps.



Source: https://stackoverflow.com/questions/61320949/for-loop-keeps-iterating-over-first-piece-of-data-in-list
