How can I use BeautifulSoup to get deeply nested div values?

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-23 02:19:27

Question


I need to get the values of deeply nested <span> elements in a DOM structure that looks like this:

<div class="panda">
    <div class="that">
        <ul class="foo">
            <li class="bar">
                <div class="hi">
                    <p class="bye">
                        <span class="cheese">Cheddar</span>

The problem with

soup.findAll("span", {"class": "cheese"})

is that there are hundreds of span elements on the page with class "cheese", so I need to restrict the search to those inside the div with class "panda". The result should be a list of values like ["Cheddar", "Parmesan", "Swiss"].


Answer 1:


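Use CSS selectors. The space in '.panda .cheese' is the descendant combinator, so it matches any element with class "cheese" nested at any depth inside an element with class "panda":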

[e.get_text() for e in soup.select('.panda .cheese')]
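To see the selector at work, here is a minimal, self-contained sketch. The sample markup is hypothetical (it closes the tags from the question and adds one "cheese" span outside "panda"), and it assumes Beautiful Soup 4 with the built-in "html.parser" backend:

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: two cheeses inside "panda", one outside it.
html = """
<div class="panda">
  <div class="that">
    <ul class="foo">
      <li class="bar"><div class="hi"><p class="bye"><span class="cheese">Cheddar</span></p></div></li>
      <li class="bar"><div class="hi"><p class="bye"><span class="cheese">Parmesan</span></p></div></li>
    </ul>
  </div>
</div>
<div class="other"><span class="cheese">Swiss</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Every span with class "cheese", wherever it sits on the page.
all_cheeses = [e.get_text() for e in soup.select(".cheese")]

# Only the ones that have an ancestor with class "panda".
panda_cheeses = [e.get_text() for e in soup.select(".panda .cheese")]

print(all_cheeses)
print(panda_cheeses)
```

The second list leaves out the span that is not inside "panda", which is the filtering the question asks for.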

Or, if you prefer find_all:

# Calling a soup or tag object directly is shorthand for find_all()

[e.get_text() for panda in soup('div', {'class': 'panda'})
              for e in panda('span', {'class': 'cheese'})]
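The same nested search can be written with explicit find_all calls and the class_ keyword. This is only a sketch: the helper name cheeses_in_panda and the sample markup are made up for illustration, and it again assumes Beautiful Soup 4 with "html.parser":

```python
from bs4 import BeautifulSoup

# Hypothetical sample page with two separate "panda" containers.
sample_html = """
<div class="panda"><ul><li><p><span class="cheese">Cheddar</span></p></li></ul></div>
<div class="other"><span class="cheese">Gouda</span></div>
<div class="panda"><p><span class="cheese"> Swiss </span></p></div>
"""

def cheeses_in_panda(markup):
    """Return the text of each span.cheese found inside a div.panda."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        span.get_text(strip=True)  # strip=True trims surrounding whitespace
        for panda in soup.find_all("div", class_="panda")
        for span in panda.find_all("span", class_="cheese")
    ]

print(cheeses_in_panda(sample_html))
```

One caveat with the nested loop: if one "panda" div is nested inside another, a span inside the inner one is found once per enclosing "panda" div, so the list can contain duplicates. The CSS selector form returns each matching element once.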


Source: https://stackoverflow.com/questions/27355051/how-can-i-use-beautifulsoup-to-get-deeply-nested-div-values
