How can I get all the text content of an XML document as a single string, like this Ruby/hpricot example, but using Python?
I'd like to replace XML tags with a sin
I really like BeautifulSoup, and would rather not use regex on HTML if we can avoid it.
Adapted from: [this StackOverflow Answer], [BeautifulSoup documentation]
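from bs4 import BeautifulSoup

# txt is a string containing your XML document.
# Naming a parser avoids the "no parser specified" warning; "html.parser" ships with Python.
soup = BeautifulSoup(txt, "html.parser")
# string=True matches every text node (the older text=True spelling is deprecated).
pageText = soup.find_all(string=True)
print(' '.join(pageText))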
Though of course, you can (and should) use BeautifulSoup to navigate the page for what you are looking for.
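As a rough sketch of what that navigation might look like, the snippet below pulls the text out of specific elements instead of the whole document. The sample XML and its tag names (`item`, `name`, `price`) are made up for illustration and are not from the question. It also uses `get_text()`, which is a shorter way to get the whole document's text joined by a separator.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, invented for this example.
txt = (
    "<items>"
    "<item><name>Widget</name><price>9.99</price></item>"
    "<item><name>Gadget</name><price>19.50</price></item>"
    "</items>"
)

soup = BeautifulSoup(txt, "html.parser")

# All text in the document, with a single space wherever tags separated it.
all_text = soup.get_text(separator=" ", strip=True)
print(all_text)  # Widget 9.99 Gadget 19.50

# Only the text inside the <name> elements.
names = [tag.get_text() for tag in soup.find_all("name")]
print(names)  # ['Widget', 'Gadget']
```

Note that `html.parser` treats the input as HTML, so tag names are lowercased. If lxml is installed, passing `"xml"` as the parser instead is probably the better fit for real XML.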