Python/Selenium - how to loop through hrefs in <li>?

我怕爱的太早我们不能终老 submitted on 2020-04-30 04:22:07

Question


Web URL: https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times

Hi folks, I want to parse the HTML of the page above.

I want to get every href inside the <li> elements, along with the highlighted text. I tried the following code:

elementList = driver.find_element_by_class_name('block-wysiwyg').find_elements_by_tag_name("li")
for i in range(len(elementList)):
    driver.find_element_by_class_name('blcokwysiwyg').find_elements_by_tag_name("li").get_attribute("href")

But the block returned none.

Can anyone please help me with the above code? Thank you for the help!


Answer 1:


This should fetch the required content:

import requests
from bs4 import BeautifulSoup

link = 'https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times'

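r = requests.get(link)
soup = BeautifulSoup(r.text, "html.parser")
# Visit every <li> inside the element with class "block-wysiwyg".
for item in soup.select(".block-wysiwyg li"):
    item_text = item.get_text(strip=True)
    # select_one returns None when the <li> contains no link, so guard against it.
    anchor = item.select_one("a[href]")
    item_link = anchor.get("href") if anchor else None
    print(item_text, item_link)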
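
One caveat with the requests/BeautifulSoup route: BeautifulSoup returns the href exactly as it is written in the markup, so it may be a relative path, whereas Selenium's get_attribute('href') gives the resolved URL. Whether this particular page uses relative links is not confirmed here. A minimal sketch of resolving them with the standard library, assuming a hypothetical helper named resolve_href:

```python
from urllib.parse import urljoin

# Page URL from the question, used as the base for relative links.
PAGE_URL = "https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times"


def resolve_href(href, base_url=PAGE_URL):
    """Return an absolute URL for href, or None when no link was found.

    resolve_href is a hypothetical helper for illustration only.
    """
    if not href:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    return urljoin(base_url, href)
```

Inside the loop you would then print resolve_href(item_link) instead of item_link.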



Answer 2:


Try it this way:

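# find_element_* (singular) returns only the first matching <li>
coronas = driver.find_element_by_xpath("//div[@class='block-wysiwyg']/ul/li")
# the href lives on the <a> child, not on the <li> itself
hr = coronas.find_element_by_xpath('./a')
print(coronas.text)
print(hr.get_attribute('href'))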

Output:

The coronavirus is touching the lives of all Americans, but race, age, and income play a big role in the exact ways the virus — and the stalled economy — are affecting people. Here's what that means.
https://www.ipsos.com/en-us/america-under-coronavirus
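
Note that this prints only the first item, because find_element_by_xpath returns a single element. To loop over every <li>, something like the following might work. It is a minimal sketch that assumes the same page structure as the XPath above. The helper name collect_li_links is hypothetical, and the locator is passed as the plain string "xpath" (the value of Selenium's By.XPATH constant), so it should also run on Selenium 4, where the find_element_by_* helpers are no longer available:

```python
def collect_li_links(driver):
    """Return (text, href) pairs for every <li> under the block-wysiwyg div.

    collect_li_links is a hypothetical helper; driver is an already
    opened Selenium WebDriver pointed at the page in question.
    """
    results = []
    # find_elements (plural) returns a list with all matches, possibly empty.
    items = driver.find_elements("xpath", "//div[@class='block-wysiwyg']/ul/li")
    for li in items:
        # Look for direct <a> children; an empty list means no link.
        anchors = li.find_elements("xpath", "./a")
        href = anchors[0].get_attribute("href") if anchors else None
        results.append((li.text, href))
    return results
```

With a live driver you would call it as: for text, href in collect_li_links(driver): print(text, href). Using find_elements for the inner lookup avoids a NoSuchElementException on list items that have no link.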


Source: https://stackoverflow.com/questions/61235160/python-selenium-how-to-loop-through-hrefs-in-li
