How to find an element based on its text, ignoring child tags, in BeautifulSoup

Submitted by 大城市里の小女人 on 2019-12-23 12:01:39

Question


I am looking for a way, using Python and BeautifulSoup, to find an element based on its own text, ignoring the text of any child tags. For example:

<div> <b>Ignore this text</b>Find based on this text </div>

How can I find this div? Thanks for your help!


Answer 1:


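You can use .find with the text argument to locate the text node, and then call findParent on it to get the enclosing element. (In newer versions of BeautifulSoup the text argument is also available as string, and findParent as find_parent.)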

Ex:

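from bs4 import BeautifulSoup

s = """<div> <b>Ignore this text</b>Find based on this text </div>"""
soup = BeautifulSoup(s, 'html.parser')
# The match is exact, so the trailing space of the text node must be included
t = soup.find(text="Find based on this text ")
# Step up from the matched text node to the element that contains it
print(t.findParent())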

Output:

<div> <b>Ignore this text</b>Find based on this text </div>
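
The exact match above only works if the search string reproduces the text node character for character, including its trailing space. As a rough sketch of a more tolerant variant, the code below passes a function to find and compares only the text that sits directly inside each tag, with surrounding whitespace stripped. The helper names own_text and find_by_own_text are made up for this illustration and are not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup, NavigableString


def own_text(tag):
    """Join the text nodes that are direct children of `tag`, skipping child tags."""
    pieces = [child for child in tag.children if isinstance(child, NavigableString)]
    return "".join(pieces).strip()


def find_by_own_text(soup, target):
    """Return the first tag whose direct text, stripped, equals `target` (hypothetical helper)."""
    return soup.find(lambda tag: own_text(tag) == target)


html = "<div> <b>Ignore this text</b>Find based on this text </div>"
soup = BeautifulSoup(html, "html.parser")

match = find_by_own_text(soup, "Find based on this text")
print(match.name)
```

Because the comparison is done on stripped text, the search string no longer needs the trailing space, and the text inside the b tag does not affect the result.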



Answer 2:


Try this. It uses the same HTML as your example: remove the child b tags from the div, then read the text that is left.

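from bs4 import BeautifulSoup

html = """
<div> <b>Ignore this text</b>Find based on this text </div>
"""

# The 'lxml' parser requires the lxml package to be installed
soup = BeautifulSoup(html, 'lxml')

s = soup.find('div')

# Remove every <b> child from the tree; this modifies the parsed document
for child in s.find_all('b'):
    child.decompose()

# Only the div's own text remains
print(s.get_text())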

Output:

 Find based on this text 
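
Note that decompose removes the b tags from the parsed tree for good. If the rest of the document is still needed, one possible alternative (a sketch, not the answerer's method) is to collect only the text nodes that are direct children of the div, using find_all with string=True and recursive=False. The variable names div and direct_text are chosen for this illustration.

```python
from bs4 import BeautifulSoup

html = "<div> <b>Ignore this text</b>Find based on this text </div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")

# string=True matches text nodes; recursive=False limits the search to
# direct children, so text inside <b> is skipped and nothing is removed
direct_text = "".join(div.find_all(string=True, recursive=False))

print(repr(direct_text))
```

The b tag is still present in soup afterwards, so later lookups on the same tree are unaffected.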


Source: https://stackoverflow.com/questions/50459203/how-to-find-element-based-on-text-ignore-child-tags-in-beautifulsoup
