Regular expression to match closing HTML tags

清歌不尽 2020-12-10 08:33

I'm working on a small Python script to clean up HTML documents. It works by accepting a list of tags to KEEP and then parsing through the HTML code, trashing tags that are

4 answers
  •  长情又很酷
    2020-12-10 08:50

    1. Read:

      • RegEx match open tags except XHTML self-contained tags
      • Can you provide some examples of why it is hard to parse XML and HTML with a regex?
    2. Repent.

    3. Use a real HTML parser, like BeautifulSoup.
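A minimal sketch of the parser-based approach, for the task described in the question (keep a list of tags, drop the rest but leave their text). It deliberately uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without third-party packages. The names `TagWhitelistCleaner` and `strip_tags` are hypothetical, invented for this illustration. It assumes that comments and doctype declarations can be discarded and that the text inside a dropped tag should be preserved.

```python
from html.parser import HTMLParser


class TagWhitelistCleaner(HTMLParser):
    """Re-emit HTML, dropping every tag not in `keep` but keeping its text."""

    def __init__(self, keep):
        # convert_charrefs=False so entities such as &amp; pass through unchanged.
        super().__init__(convert_charrefs=False)
        self.keep = {name.lower() for name in keep}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.keep:
            # get_starttag_text() returns the start tag as written in the source.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form, e.g. <br/>.
        if tag in self.keep:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.keep:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always kept, even when the enclosing tag is dropped.
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_tags(html, keep):
    """Return `html` with every tag outside `keep` removed (hypothetical helper)."""
    cleaner = TagWhitelistCleaner(keep)
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.parts)


if __name__ == "__main__":
    print(strip_tags("<div><p>Hello <b>world</b></p><span>!</span></div>", ["p", "b"]))
```

With BeautifulSoup, the equivalent step would likely be to walk `soup.find_all(True)` and call `unwrap()` on each tag whose name is not in the keep list. Either way, the parser decides where tags begin and end, so no regular expression has to match closing tags.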
