BeautifulSoup 4: Remove comment tag and its content

挽巷 2020-12-31 04:35

So the page that I'm scraping contains this HTML. How do I remove the comment tag along with its content with bs4?

3 Answers
  •  自闭症患者
    2020-12-31 05:26

    From this answer: if you are looking for a solution in BeautifulSoup version 3, see BS3 Docs - Comment.

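    import re
    from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3, Python 2 only

    # The comment text in this markup is an illustrative placeholder.
    soup = BeautifulSoup("""Hello! <!-- remove me if you can -->""")
    # Find one comment node and take its class for the isinstance() check.
    comment = soup.find(text=re.compile("if"))
    Comment = comment.__class__
    # Detach every comment node from the tree.
    for element in soup(text=lambda text: isinstance(text, Comment)):
        element.extract()
    print soup.prettify()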
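
For BeautifulSoup 4, which is what the question asks about, the same idea can be sketched roughly as follows. This assumes the `bs4` package on Python 3 and its `Comment` class; the helper name `strip_comments` and the sample markup are hypothetical, chosen only for illustration.

```python
from bs4 import BeautifulSoup, Comment


def strip_comments(html):
    """Return html with every HTML comment removed (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Comment is a NavigableString subclass, so a string= predicate can select it.
    for node in soup.find_all(string=lambda s: isinstance(s, Comment)):
        # extract() detaches the comment node, including everything inside it.
        node.extract()
    return str(soup)


# Illustrative markup only; the comment wraps a tag to show its content goes too.
cleaned = strip_comments('<div class="foo">cat dog<!-- hidden <p>note</p> --></div>')
print(cleaned)
```

Because the parser treats everything between `<!--` and `-->` as a single comment node, extracting that node should also drop any markup written inside the comment.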
    
