Select all div siblings by using BeautifulSoup

Submitted by 纵饮孤独 on 2019-12-23 01:50:08

Question


I have an html file which has a structure like the following:

<div>
</div>

<div>
</div>

<div>
  <div>
  </div>
  <div>
  </div>
  <div>
  </div>
</div>

<div>
  <div>
  </div>
</div>

I would like to select all the sibling divs without selecting the nested divs in the third and fourth blocks. If I use find_all(), I get all the divs, including the nested ones.


Answer 1:


You can find direct children of the parent element:

soup.select('body > div')

to get only the div elements that sit directly under the top-level body tag.
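As a minimal sketch of the child combinator in action, the snippet below compares select('body > div') with a plain find_all('div'). The sample markup and its id attributes are assumptions added for illustration; they are not part of the question's file. Note that the sample includes an explicit body tag: the built-in html.parser does not add one to a bare fragment, so the selector would match nothing without it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the question's structure; the ids are
# added here only so the results are easy to read.
HTML = """
<html><body>
<div id="a"></div>
<div id="b"></div>
<div id="c">
  <div id="c1"></div>
  <div id="c2"></div>
  <div id="c3"></div>
</div>
<div id="d">
  <div id="d1"></div>
</div>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# 'body > div' matches only div elements whose parent is <body>.
top_level_ids = [div["id"] for div in soup.select("body > div")]

# find_all() searches recursively by default, so nested divs are included.
all_ids = [div["id"] for div in soup.find_all("div")]

print(top_level_ids)
print(all_ids)
```

Here top_level_ids holds the four outer divs, while all_ids also picks up the nested ones.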

You could also find the first div, then grab all matching siblings with Element.find_next_siblings():

first_div = soup.find('div')
all_divs = [first_div] + first_div.find_next_siblings('div')
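The sibling approach can be tried on a small fragment as follows. This is only a sketch: the markup and ids are made up for illustration, and it relies on the first div in document order being one of the top-level siblings, which holds for the structure in the question but may not hold for other pages.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment; the ids are illustrative only.
HTML = """<div id="a"></div>
<div id="b"></div>
<div id="c"><div id="c1"></div></div>
<div id="d"><div id="d1"></div></div>"""

soup = BeautifulSoup(HTML, "html.parser")

# find() returns the first div in document order, here the top-level "a".
first_div = soup.find("div")

# find_next_siblings() looks only at later elements sharing the same parent,
# so the nested divs "c1" and "d1" are never visited.
all_divs = [first_div] + first_div.find_next_siblings("div")

sibling_ids = [div["id"] for div in all_divs]
print(sibling_ids)
```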

Or you could use the element.children generator and filter those:

all_divs = (elem for elem in top_level.children if getattr(elem, 'name', None) == 'div')

where top_level is the element containing these div elements directly.
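A short sketch of the children-based filter, under the assumption that the divs sit at the root of a fragment parsed with html.parser, so the soup object itself can serve as top_level. The sample markup and ids are hypothetical. For comparison, the sketch also calls find_all() with recursive=False, which restricts the search to direct children and should give the same elements.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment; the ids are illustrative only.
HTML = """<div id="a"></div>
<div id="b"></div>
<div id="c"><div id="c1"></div></div>
<div id="d"><div id="d1"></div></div>"""

soup = BeautifulSoup(HTML, "html.parser")

# Assumption: with no wrapping <body>, the soup object is the direct parent.
top_level = soup

# .children also yields the whitespace text nodes between tags, so filter
# on the tag name; getattr() guards against nodes without a usable name.
child_divs = [
    elem for elem in top_level.children
    if getattr(elem, "name", None) == "div"
]

# recursive=False limits find_all() to direct children of top_level.
direct_divs = top_level.find_all("div", recursive=False)

child_ids = [div["id"] for div in child_divs]
direct_ids = [div["id"] for div in direct_divs]
print(child_ids)
print(direct_ids)
```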



Source: https://stackoverflow.com/questions/27826883/select-all-div-siblings-by-using-beautifulsoup
