from mechanize import Browser
br = Browser()
br.open(\'http://somewebpage\')
html = br.response().readlines()
for line in html:
print line
When p
If you want to strip all HTML tags the easiest way I found is using BeautifulSoup:
from bs4 import BeautifulSoup # Or from BeautifulSoup import BeautifulSoup
def stripHtmlTags(htmlTxt):
if htmlTxt is None:
return None
else:
return ''.join(BeautifulSoup(htmlTxt).findAll(text=True))
I tried the code of the accepted answer but I was getting "RuntimeError: maximum recursion depth exceeded", which didn't happen with the above block of code.