Question
I'm trying to scrape this website using Python and Selenium. It requires you to select a date from a drop-down box and then click Search to view the planning applications.
URL: https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList.
I have the code working to select the first index of the drop-down box and press search. How would I open multiple windows for all the date options in the drop-down box or go through them one by one so I can scrape it?
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('/Users/weaabduljamac/Downloads/chromedriver',
chrome_options=options)
url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
driver.get(url)
select = Select(driver.find_element_by_xpath('//*[@id="selWeek"]'))
select.select_by_index(1)
button = driver.find_element_by_id('csbtnSearch')
button.click()
app_numbers = driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text
print(app_numbers)
Drop-down box HTML:
<select class="formitem" id="selWeek" name="selWeek">
<option selected="selected" value="2018,31">Week commencing Monday 30 July 2018</option>
<option value="2018,30">Week commencing Monday 23 July 2018</option>
<option value="2018,29">Week commencing Monday 16 July 2018</option>
<option value="2018,28">Week commencing Monday 9 July 2018</option>
<option value="2018,27">Week commencing Monday 2 July 2018</option>
<option value="2018,26">Week commencing Monday 25 June 2018</option>
<option value="2018,25">Week commencing Monday 18 June 2018</option>
<option value="2018,24">Week commencing Monday 11 June 2018</option>
<option value="2018,23">Week commencing Monday 4 June 2018</option>
<option value="2018,22">Week commencing Monday 28 May 2018</option>
</select>
Answer 1:
As per your question, you won't be able to open multiple windows for the different drop-down options, because the <option> tags don't contain any href attribute. They will always render the new page in the same browser window.
However, to select a date from the drop-down and then click() Search to view the planning applications, you can use the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
driver.get(url)
# Count the options once; the page is reloaded on every iteration
select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
list_options = select.options
for item in range(len(list_options)):
    # Re-locate the drop-down, because the earlier reference is stale after the reload
    select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
    select.select_by_index(item)
    driver.find_element_by_css_selector("input.formbutton#csbtnSearch").click()
    # Print the first application number on the results page
    print(driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text)
    # Go back to the search form for the next week
    driver.get(url)
driver.quit()
Console Output:
18/06760/FUL
18/07187/LBC
18/06843/FUL
18/06705/FUL
18/06449/FUL
18/05534/FUL
18/06030/DEM
18/05784/FUL
18/05914/LBC
18/05241/FUL
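The output above does not say which week each application number belongs to. Here is a minimal sketch for labelling results, assuming the option values in the question's HTML (for example `2018,31`) are a year followed by an ISO week number. That assumption matches the labels shown: week 31 of 2018 begins on Monday 30 July 2018. `week_start` is a hypothetical helper name, and `date.fromisocalendar` needs Python 3.8 or later.

```python
from datetime import date


def week_start(option_value):
    """Return the Monday of the week encoded as 'YEAR,WEEK' (assumed ISO week)."""
    year_text, week_text = option_value.split(",")
    # Day 1 of an ISO week is Monday.
    return date.fromisocalendar(int(year_text), int(week_text), 1)


# Values copied from the drop-down HTML in the question.
for value in ("2018,31", "2018,22"):
    print(value, "->", week_start(value).isoformat())
```

Inside the Selenium loop, the value of the current option could be read with `get_attribute('value')` on the option element and passed to this helper.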
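Trivia
To scrape all the links rather than only the first, replace:
find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a')
with:
find_elements_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a')
Note that tr[1] still restricts the match to the first table row, so the row index probably needs to be dropped as well (tr/td[1]/a) to match every row.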
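One detail worth keeping in mind: the plural `find_elements_by_xpath` returns a list of elements rather than a single element, so `.text` has to be read from each item in turn. A small sketch of that step follows; `link_texts` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for Selenium elements so the snippet runs without a browser.

```python
from types import SimpleNamespace


def link_texts(elements):
    """Collect the non-empty, stripped .text of each element in a list."""
    texts = []
    for element in elements:
        text = element.text.strip()
        if text:
            texts.append(text)
    return texts


# Placeholder stand-ins for the list a find_elements call would return.
fake_links = [
    SimpleNamespace(text="18/06760/FUL"),
    SimpleNamespace(text=" 18/07187/LBC "),
    SimpleNamespace(text=""),
]
print(link_texts(fake_links))
```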
Answer 2:
I am pretty sure this is not possible. You would have to loop through the options, store the data somewhere, and append the new data from each drop-down option as you go.
Hope this helps.
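The loop-and-accumulate idea could look roughly like this. It is only a sketch: `collect_by_week` and `fetch_week` are hypothetical names, and `fetch_week` stands for whatever Selenium code selects one option, clicks Search and returns the application numbers it finds. A stub is used here so the example runs on its own.

```python
def collect_by_week(week_values, fetch_week):
    """Call fetch_week for each option value and store the results per week."""
    results = {}
    for value in week_values:
        # Materialise the results as a list so generators are consumed here.
        results[value] = list(fetch_week(value))
    return results


def stub_fetch(value):
    # Placeholder data only; a real version would drive the browser.
    return ["application for " + value]


data = collect_by_week(["2018,31", "2018,30"], stub_fetch)
print(data)
```

Keeping the results keyed by option value makes it easy to write them out later, for example one CSV row per week and application.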
Answer 3:
You can Ctrl+click the Search button to open the result in a new window, scrape the data, and then return to the first page to select the next option.
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

# original window to switch back to
window_before = driver.window_handles[0]
select = Select(driver.find_element_by_id('selWeek'))
options = select.options
for option in options:
    select.select_by_visible_text(option.text)
    # Ctrl+click to open the result in a new window
    button = driver.find_element_by_id('csbtnSearch')
    ActionChains(driver).key_down(Keys.CONTROL).click(button).key_up(Keys.CONTROL).perform()
    # switch to the new window and scrape the data
    driver.switch_to.window(driver.window_handles[1])
    # scrape the data
    # close the new window and return to the original one
    driver.close()
    driver.switch_to.window(window_before)
Source: https://stackoverflow.com/questions/51723000/how-to-open-the-option-items-of-a-select-tag-dropdown-in-different-tabs-window