Question
I'm very new to Python and trying to learn web scraping. Following a tutorial, I'm trying to extract a price from a website, but nothing is printed. What is wrong with my code?
from selenium import webdriver
chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")
price = driver.find_elements_by_class_name("fare-and-services-flight-select-fare-value ng-isolate-scope")
for post in price:
    print(post.text)
Answer 1:
To print the first price, you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "isa-flight-select button:first-child span.fare-and-services-flight-select-fare-value.ng-isolate-scope"))).get_attribute("innerHTML"))
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//isa-flight-select//following::button[contains(@class, 'button')]//span[@class='fare-and-services-flight-select-fare-value ng-isolate-scope']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console output of two back-to-back executions:
475
You can find a relevant discussion in How to retrieve the title attribute through Selenium using Python?
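To make the waiting step less of a black box, here is a minimal sketch of the polling idea behind an explicit wait: call a condition repeatedly until it returns something truthy, or give up after a timeout. The names wait_until and fake_find_prices are hypothetical and exist only for this illustration; this is not Selenium's implementation. Selenium's WebDriverWait polls every 0.5 seconds by default and raises its own TimeoutException, whereas this sketch uses Python's built-in TimeoutError.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Stand-in for a page whose price only shows up after a few polls
# (hypothetical; a real check would query the browser instead).
attempts = {"count": 0}


def fake_find_prices():
    attempts["count"] += 1
    return ["475"] if attempts["count"] >= 3 else []


prices = wait_until(fake_find_prices, timeout=5, poll_interval=0.01)
print(prices)
```

An empty list is falsy in Python, which is why a lookup that returns no elements keeps the loop going until something is found.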
Outro
As per the documentation:
- get_attribute() method: Gets the given attribute or property of the element.
- text attribute returns: The text of the element.
- See also: Difference between text and innerHTML using Selenium
Answer 2:
The first reason is that the webpage you are trying to scrape uses JavaScript to load the HTML, so you need to wait until the element is present before reading it, using Selenium's WebDriverWait.
The second reason is that the find_elements_by_class_name method accepts only one class name, so you need to use either find_elements_by_css_selector or find_elements_by_xpath instead.
This is how your code should look:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")
price = WebDriverWait(driver, 10).until(
    lambda x: x.find_elements_by_css_selector(".currency-value.fare-value.ng-scope.ng-isolate-scope"))
for post in price:
    print(post.get_attribute("innerText"))
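Since a class-name lookup takes a single class, one way to handle a space-separated class attribute like the one in the question is to turn it into a CSS selector by joining the class names with dots. The helper below is a small sketch of that conversion; class_attr_to_css is a hypothetical name, not part of Selenium, and it assumes the class names contain no characters that need CSS escaping.

```python
def class_attr_to_css(class_attr, tag=""):
    """Build a CSS selector that matches every class in `class_attr`.

    `class_attr` is the raw, space-separated value of an HTML class attribute.
    `tag` optionally restricts the match to one element type, e.g. "span".
    """
    names = class_attr.split()
    if not names:
        raise ValueError("class attribute is empty")
    return tag + "".join("." + name for name in names)


# The compound class value from the question, limited to <span> elements.
selector = class_attr_to_css(
    "fare-and-services-flight-select-fare-value ng-isolate-scope", tag="span"
)
print(selector)
```

The resulting string can be passed to find_elements_by_css_selector as in the code above, or to find_elements(By.CSS_SELECTOR, selector) in Selenium 4, where the find_elements_by_* helpers are no longer available.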
Source: https://stackoverflow.com/questions/57739736/how-to-grab-the-price-information-from-flight-reservation-site-https-reservati