Python/Selenium - how to loop through hrefs in <li>?

Question

Web URL: https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times

Hi folks, I want to parse the HTML shown below:

I want to get every href inside the <li> elements, along with the highlighted text. I tried this code:

elementList = driver.find_element_by_class_name('block-wysiwyg').find_elements_by_tag_name("li")
for i in range(len(elementList)):
    driver.find_element_by_class_name('blcokwysiwyg').find_elements_by_tag_name("li").get_attribute("href")

But the block returned none.

Can anyone please help me with the above code? Thank you for the help!

Answer 1:

This should fetch the required content, using requests and BeautifulSoup rather than Selenium:

import requests
from bs4 import BeautifulSoup

link = 'https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times'

r = requests.get(link)
soup = BeautifulSoup(r.text, "html.parser")
# Each <li> inside the .block-wysiwyg container holds one entry
for item in soup.select(".block-wysiwyg li"):
    item_text = item.get_text(strip=True)
    # select_one returns None when the <li> contains no link
    anchor = item.select_one("a[href]")
    item_link = anchor.get("href") if anchor else None
    print(item_text, item_link)
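
One difference from the Selenium approach is worth keeping in mind: an href read from the raw HTML may be a relative path, whereas Selenium's get_attribute('href') returns the resolved URL. The following is a minimal sketch, assuming some hrefs on the page are relative, of normalizing them with the standard library's urljoin. The helper name absolutize is hypothetical and not part of the original answer.

```python
from urllib.parse import urljoin

# Page the links were scraped from (same URL as in the answer above)
PAGE_URL = "https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times"


def absolutize(href, base=PAGE_URL):
    """Resolve href against base (hypothetical helper, for illustration only).

    Absolute URLs pass through unchanged; a missing href stays None.
    """
    return urljoin(base, href) if href else None
```

Passing each href read in the loop above through absolutize before printing would make the output of both approaches comparable.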

Answer 2:

Try it this way:

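# find_element (singular) returns only the first matching <li>
coronas = driver.find_element_by_xpath("//div[@class='block-wysiwyg']/ul/li")
# Look up the link that is a direct child of that <li>
hr = coronas.find_element_by_xpath('./a')
print(coronas.text)
print(hr.get_attribute('href'))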

Output:

The coronavirus is touching the lives of all Americans, but race, age, and income play a big role in the exact ways the virus — and the stalled economy — are affecting people. Here's what that means.
https://www.ipsos.com/en-us/america-under-coronavirus
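
The snippet above prints only the first list item, while the question asks for all of them. Here is a minimal sketch of the looping version, assuming the same XPath as above matches every entry. The function name collect_li_links is hypothetical. It uses the generic find_elements(by, value) call with the locator string "xpath" (the value of Selenium's By.XPATH), because the find_element_by_* helpers used above were removed in Selenium 4.3.

```python
def collect_li_links(driver):
    """Return a list of (text, href) pairs, one per <li> in the block.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the page. href is None for an <li> without a link.
    """
    # find_elements (plural) returns a list of every matching <li>
    items = driver.find_elements("xpath", "//div[@class='block-wysiwyg']/ul/li")
    results = []
    for li in items:
        # Search relative to this <li>; an empty list means no link inside
        links = li.find_elements("xpath", ".//a[@href]")
        href = links[0].get_attribute("href") if links else None
        results.append((li.text, href))
    return results
```

With a live session this would be called after driver.get(...) has loaded the page, for example by iterating over collect_li_links(driver) and printing each pair. Using find_elements for the inner lookup avoids the NoSuchElementException that find_element raises when an <li> has no link.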

Source: https://stackoverflow.com/questions/61235160/python-selenium-how-to-loop-through-hrefs-in-li

Tags

python-3.x

selenium-webdriver

web-scraping
