Question
I need to get the values of deeply nested <span> elements in a DOM structure that looks like this:
<div class="panda">
<div class="that">
<ul class="foo">
<li class="bar">
<div class="hi">
<p class="bye">
<span class="cheese">Cheddar</span>
The problem with
soup.findAll("span", {"class": "cheese"})
is that the page has hundreds of span elements with class "cheese", so I need to restrict the search to those nested inside the element with class "panda". I need to get a list of values like ["Cheddar", "Parmesan", "Swiss"].
Answer 1:
Use CSS selectors:
[e.get_text() for e in soup.select('.panda .cheese')]
Or, if you prefer find_all:
# Calling a soup or tag object directly is shorthand for find_all()
[e.get_text() for panda in soup('div', {'class': 'panda'})
 for e in panda('span', {'class': 'cheese'})]
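To check that both approaches return the same list, here is a minimal self-contained sketch. The HTML string is a made-up sample (the real page is not shown in the question), the "other" div and the "Brie" entry are hypothetical additions to show a cheese span outside "panda" being skipped, and the built-in "html.parser" backend is an assumption; any installed parser should behave the same here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: three cheese spans inside div.panda, one outside it.
HTML = """
<div class="panda">
  <div class="that">
    <ul class="foo">
      <li class="bar"><div class="hi"><p class="bye"><span class="cheese">Cheddar</span></p></div></li>
      <li class="bar"><div class="hi"><p class="bye"><span class="cheese">Parmesan</span></p></div></li>
      <li class="bar"><div class="hi"><p class="bye"><span class="cheese">Swiss</span></p></div></li>
    </ul>
  </div>
</div>
<div class="other"><span class="cheese">Brie</span></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# CSS descendant selector: any .cheese element at any depth below a .panda element.
via_select = [e.get_text() for e in soup.select(".panda .cheese")]

# Equivalent nested find_all: first the panda divs, then the cheese spans inside each.
via_find_all = [
    e.get_text()
    for panda in soup.find_all("div", class_="panda")
    for e in panda.find_all("span", class_="cheese")
]

print(via_select)    # ['Cheddar', 'Parmesan', 'Swiss']
print(via_find_all)  # ['Cheddar', 'Parmesan', 'Swiss']
```

The selector version is shorter. One difference to keep in mind: if "panda" elements were nested inside each other, the nested find_all version would return the inner spans once per enclosing "panda", while select returns each matching element only once.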
Source: https://stackoverflow.com/questions/27355051/how-can-i-use-beautifulsoup-to-get-deeply-nested-div-values