I have downloaded a page using urlopen. How do I remove all HTML tags from it? Is there a regexp that would replace all <*> tags?
You could use html2text, which is meant to produce a readable text equivalent of an HTML source (programmatically from Python or as a command-line tool). I am extrapolating your needs from your question, though...
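If all you need is the bare text and you would rather not install anything, here is a minimal sketch using only the standard library's html.parser instead of a regexp. This is not how html2text works internally; the names TagStripper and strip_tags are made up for illustration, and skipping the contents of script and style elements is my assumption about what you want. Note that urlopen gives you bytes, so decode them to str first (the page's encoding is something you have to know or detect).

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects text nodes and drops the tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def get_text(self):
        return "".join(self._chunks)


def strip_tags(html):
    """Return the text content of an HTML string with the tags removed."""
    stripper = TagStripper()
    stripper.feed(html)
    stripper.close()  # flush any text still buffered by the parser
    return stripper.get_text()


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b></p>"))  # Hello world
```

A parser is generally safer here than a pattern like <[^>]*>, which can trip over comments, script bodies, or a ">" inside an attribute value. Whitespace and line breaks come out exactly as they were in the source, so you may want to normalize them afterwards.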