I need to parse XML whose creation I have no control over. Unfortunately it's not very strict XML and contains things like:
This
Use a lenient parser library such as tidy or tagsoup; both are built to cope with markup that is not well-formed.
tidy
tagsoup
TagSoup, a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short.
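To make the idea concrete, here is a rough sketch of the same approach (read the sloppy markup with a forgiving, event-based parser, then build a well-formed tree from the events). It is not TagSoup or tidy: it swaps in Python's standard-library `html.parser` purely so the example runs without extra dependencies. The names `SoupToTree` and `parse_lenient` are made up for illustration. Like TagSoup, the underlying parser is HTML-oriented, so it lowercases tag names and treats `script`/`style` specially, which may or may not suit your input.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class SoupToTree(HTMLParser):
    """Hypothetical helper: turns lenient parse events into an ElementTree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements that are currently open

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as empty strings.
        clean = {k: (v if v is not None else "") for k, v in attrs}
        self.builder.start(tag, clean)
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching start: ignore it
        # Close any unclosed inner elements first, then the requested one.
        while self.open_tags:
            current = self.open_tags.pop()
            self.builder.end(current)
            if current == tag:
                break

    def handle_data(self, data):
        # Text outside the root element cannot be represented, so drop it.
        if self.open_tags:
            self.builder.data(data)

    def finish(self):
        self.close()
        # Close whatever the input left open at end of document.
        while self.open_tags:
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def parse_lenient(text):
    """Parse sloppy markup and return the root Element of a well-formed tree."""
    parser = SoupToTree()
    parser.feed(text)
    return parser.finish()


if __name__ == "__main__":
    # Bare ampersand and two unclosed elements: a strict XML parser rejects this.
    broken = "<root><item>one & two<item2>three</root>"
    root = parse_lenient(broken)
    print(ET.tostring(root, encoding="unicode"))
```

The returned element can be serialized with `ET.tostring` and then passed to whatever strict XML tooling you already have. This sketch assumes a single root element and does nothing clever about mis-nested tags beyond closing them in stack order, so for real data a dedicated library such as the ones above is the safer choice.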