I'm working on a small Python script to clean up HTML documents. It works by accepting a list of tags to KEEP and then parsing through the HTML code, trashing tags that are
You may also consider using the HTML parser that is built into Python: the `HTMLParser` module in Python 2, or `html.parser` in Python 3 (documentation is available for both Python 2 and Python 3).
This will help you home in on the specific area of the HTML document you would like to work on, and then use regular expressions on just that part.
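As a rough illustration of the built-in parser approach, here is a minimal sketch for Python 3. The names `TagFilter` and `strip_tags` are made up for this example and are not part of the standard library. It assumes you want to keep the text content of discarded tags and only drop the tags themselves; comments and doctype declarations are dropped as well.

```python
from html import escape
from html.parser import HTMLParser


class TagFilter(HTMLParser):
    """Rebuild a document, dropping every tag whose name is not in `keep`.

    Hypothetical helper for illustration; text inside dropped tags is kept.
    """

    def __init__(self, keep):
        super().__init__(convert_charrefs=True)
        # The parser reports tag names in lowercase, so normalise the list.
        self.keep = {name.lower() for name in keep}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.keep:
            # get_starttag_text() returns the tag exactly as it was written,
            # so attributes are preserved without rebuilding them by hand.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> arrive here; emit them only once.
        if tag in self.keep:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.keep:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape the text.
        self.parts.append(escape(data, quote=False))


def strip_tags(html_text, keep):
    """Return html_text with all tags removed except those named in keep."""
    parser = TagFilter(keep)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

For example, `strip_tags('<div><p class="x">Hi <b>there</b></p><br/></div>', ['p', 'br'])` returns `'<p class="x">Hi there</p><br/>'`. If you still want regular expressions, you could apply them inside `handle_data`, where you only ever see plain text rather than markup.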