Remove all style, script, and HTML tags from an HTML page

面向向阳花 2020-12-31 07:13

Here is what I have so far:

from bs4 import BeautifulSoup

def cleanme(html):
    soup = BeautifulSoup(html, "html.parser")  # create a new bs4 object from the html data

5 Answers
  •  不知归路
    2020-12-31 07:36

    If you want a quick and dirty solution, you can use:

    re.sub(r'<[^>]*?>', '', value)
    

    This gives a rough equivalent of strip_tags in PHP. Is that what you want?
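
One caveat with the one-liner above: it removes only the tags themselves, so the JavaScript and CSS text inside script and style elements would stay in the output. A minimal sketch that handles this, assuming a regex-only approach is acceptable for the input (the helper name strip_tags and the sample HTML are made up for illustration, and regexes are not a robust HTML parser):

```python
import re

# Hypothetical helper; the name mirrors PHP's strip_tags but is not from the answer.
def strip_tags(value):
    # First drop <script> and <style> elements together with their contents,
    # since the plain tag regex would leave the JavaScript/CSS text behind.
    value = re.sub(r'(?is)<(script|style)\b[^>]*>.*?</\1\s*>', '', value)
    # Then remove every remaining tag, using the regex from the answer.
    return re.sub(r'<[^>]*?>', '', value)

# Illustrative input only.
html = ('<html><head><style>p {color: red}</style></head>'
        '<body><p>Hello <b>world</b></p><script>alert(1)</script></body></html>')
print(strip_tags(html))  # prints: Hello world
```

This still breaks on malformed markup or on a literal ">" inside an attribute value, so a real parser such as the BeautifulSoup object from the question is the safer choice for untrusted pages.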
