BeautifulSoup partial div class matching

Submitted by 社会主义新天地 on 2020-08-05 05:15:01

Question


I need to fetch milestone information from GitHub by scraping. The milestone information is embedded in divs with two different class attributes: table-list-item milestone notdue and table-list-item milestone.

How can I retrieve the information contained in both classes?

I have: milestones = soup.find_all('div', {'class': 'table-list-item milestone'}) but this line returns no divs whose class is table-list-item milestone notdue.

Right now I am doing the following (ugly hack):

milestones = soup.find_all('div', {'class':'table-list-item milestone'})
milestones.extend(soup.find_all('div', {'class': 'table-list-item milestone notdue'}))

Is there a more elegant solution for this?

As per this question, BeautifulSoup is supposed to return all matching elements. My issue is exactly the opposite!


Answer 1:


soup.find_all('div', {'class': 'milestone'})

Or use a CSS selector:

soup.select('.milestone')

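In bs4, class is a multi-valued attribute.

Its value is stored as a list: ['table-list-item', 'milestone', 'notdue'] for one kind of div and ['table-list-item', 'milestone'] for the other.

A multi-word string such as 'table-list-item milestone' is compared against the whole class attribute, so it does not match a div that also carries notdue. What you need to do is search for a single class value shared by both, such as milestone.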
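
The behaviour described above can be checked with a small self-contained sketch. The HTML snippet and the text inside the divs are made up for illustration and are not GitHub's real markup; only the class names come from the question. The built-in html.parser is assumed so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the two kinds of milestone divs.
html = """
<div class="table-list-item milestone">v1.0</div>
<div class="table-list-item milestone notdue">v2.0</div>
<div class="table-list-item">not a milestone</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Matching on one shared class value finds both milestone divs.
by_class = soup.find_all("div", class_="milestone")

# A CSS selector can require several classes at once, in any order.
by_css = soup.select("div.table-list-item.milestone")

# A multi-word string is compared against the exact attribute value,
# so the div that also has "notdue" is skipped.
exact = soup.find_all("div", {"class": "table-list-item milestone"})

print([d.get_text() for d in by_class])
print(by_class[1]["class"])  # the class attribute is a list of values
print(len(exact))
```

If the shared class name could also appear on unrelated elements, the CSS selector form is the safer choice, since it requires both table-list-item and milestone to be present.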



Source: https://stackoverflow.com/questions/42708837/beautifulsoup-partial-div-class-matching
