Remove all HTML in Python?

情话喂你 2020-12-10 23:12

Is there a way to remove or escape HTML tags using lxml.html rather than BeautifulSoup, which has some XSS issues? I tried lxml's Cleaner, but I want to remove all HTML, not just the unsafe parts.

3 answers
  •  北荒 (original poster)
    2020-12-11 00:03

    I believe this code can help you:

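    # In lxml 5.2+, this module is provided by the separate lxml_html_clean package.
    from lxml.html.clean import Cleaner
    
    # Sample input; any HTML string works here.
    html_text = "<html><head><title>Hello</title><body>Text</body></html>"
    # An allow-list containing only the empty string permits no real tags,
    # so the cleaner drops every element and keeps its text.
    cleaner = Cleaner(allow_tags=[''], remove_unknown_tags=False)
    cleaned_text = cleaner.clean_html(html_text)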
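
If the goal is just the text with every tag gone, another option is lxml.html's text_content(). For escaping rather than removing, the standard library's html.escape() can be used. The following is a minimal sketch, not a definitive implementation. It assumes lxml is installed and falls back to the standard library's html.parser if it is not. The helper name strip_tags and the sample strings are hypothetical, chosen only for illustration.

```python
import html  # standard library; provides html.escape()

try:
    import lxml.html  # third-party; assumed to be installed

    def strip_tags(markup: str) -> str:
        """Return only the text of `markup`, with every tag removed."""
        if not markup.strip():
            return ""  # lxml.html.fromstring() raises on empty input
        # text_content() concatenates the text of the element and all descendants.
        return str(lxml.html.fromstring(markup).text_content())

except ImportError:
    # Fallback (an assumption of this sketch): collect text with html.parser.
    from html.parser import HTMLParser

    class _TextCollector(HTMLParser):
        def __init__(self):
            super().__init__(convert_charrefs=True)
            self.parts = []

        def handle_data(self, data):
            # Called for every run of text between tags.
            self.parts.append(data)

    def strip_tags(markup: str) -> str:
        """Return only the text of `markup`, with every tag removed."""
        collector = _TextCollector()
        collector.feed(markup)
        collector.close()
        return "".join(collector.parts)


# Remove the tags entirely.
print(strip_tags("<p>Hello <b>World</b></p>"))

# Or keep the markup visible but inert by escaping it.
print(html.escape("<b>Hi</b>"))
```

Note that text_content() also returns the text inside script and style elements. For untrusted input, it may still be worth running Cleaner first, or escaping the result before embedding it in a page.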
    
