Python Beautifulsoup Find_all except

风流意气都作罢 submitted on 2019-12-30 09:30:18

Question


I'm struggling to find a simple way to solve this problem and hope you might be able to help.

I've been using Beautifulsoup's find_all and trying some regex to find all the items except the 'emptyItem' line in the html below:

<div class="product_item0 ">...</div>
<div class="product_item1 ">...</div>
<div class="product_item2 ">...</div>
<div class="product_item0 ">...</div>
<div class="product_item1 ">...</div>
<div class="product_item2 ">...</div>
<div class="product_item0 ">...</div>
<div class="product_item1 last">...</div>
<div class="product_item2 emptyItem">...</div>

Is there a simple way to find all the items except the one whose class includes 'emptyItem'?


Answer 1:


Just skip elements containing the emptyItem class. Working sample:

from bs4 import BeautifulSoup

data = """
<div>
    <div class="product_item0">test0</div>
    <div class="product_item1">test1</div>
    <div class="product_item2">test2</div>
    <div class="product_item2 emptyItem">empty</div>
</div>
"""

soup = BeautifulSoup(data, "html.parser")

for elm in soup.select("div[class^=product_item]"):
    if "emptyItem" in elm["class"]:  # skip elements having emptyItem class
        continue

    print(elm.get_text())

Prints:

test0
test1
test2

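Note that div[class^=product_item] is a CSS selector that matches every div element whose class attribute value starts with product_item. The match is against the whole attribute string, so a div with class="last product_item1" would not be selected.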
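Since the question asks about find_all specifically, here is a minimal sketch of two alternatives that do the filtering in a single call. The helper name is_real_product is hypothetical, and the shortened sample HTML is made up for illustration. The :not() form assumes a Beautiful Soup version whose select() is backed by soupsieve (4.7.0 or later).

```python
from bs4 import BeautifulSoup

data = """
<div>
    <div class="product_item0">test0</div>
    <div class="product_item1 last">test1</div>
    <div class="product_item2 emptyItem">empty</div>
</div>
"""

soup = BeautifulSoup(data, "html.parser")


def is_real_product(tag):
    # Hypothetical helper: keep divs that have at least one class starting
    # with "product_item" and that do not carry the "emptyItem" class.
    classes = tag.get("class") or []  # tags without a class attribute give None
    return (
        tag.name == "div"
        and any(c.startswith("product_item") for c in classes)
        and "emptyItem" not in classes
    )


# find_all accepts a function that is called once per tag.
via_find_all = [tag.get_text() for tag in soup.find_all(is_real_product)]

# The same filter expressed as one CSS selector (assumes soupsieve support).
via_select = [
    tag.get_text()
    for tag in soup.select('div[class^="product_item"]:not(.emptyItem)')
]

print(via_find_all)
print(via_select)
```

Both lists should contain the text of the first two inner divs only. Unlike the selector, the function form checks each class separately, so it would also accept a div where product_item is not the first class listed.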



Source: https://stackoverflow.com/questions/35115417/python-beautifulsoup-find-all-except
