I'm using lxml's iterparse to parse some big XML files (3-5 GB). Since some of these files have invalid characters an lxml.etree.XMLSyntaxError is thr
When you say invalid characters, do you mean Unicode characters? If so, you can try:
lxml.etree.XMLParser(encoding='UTF-8', recover=True)
If you mean malformed XML, this may not be enough, since recover=True only makes a best effort and can silently drop content. If you can post your traceback, we can see the exact XMLSyntaxError, which will show what is actually wrong with the file.
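To make the recover=True suggestion concrete, here is a minimal sketch. The sample bytes and the helper names (parse_leniently, strict_parse_fails) are made up for illustration and are not from your code. It only shows that a strict parser rejects a forbidden control character while a recovering parser still hands back a tree. What the recovering parser does with the bad character itself may depend on your libxml2 version, so inspect the result before trusting it.

```python
from lxml import etree

# Hypothetical sample: \x01 is a control character that XML 1.0 forbids.
BROKEN = b"<root><a>ok</a><b>bad \x01 char</b></root>"


def parse_leniently(data):
    """Parse bytes with a recovering parser and return the root element."""
    parser = etree.XMLParser(encoding="UTF-8", recover=True)
    return etree.fromstring(data, parser=parser)


def strict_parse_fails(data):
    """Return True if the default (strict) parser rejects the data."""
    try:
        etree.fromstring(data)
    except etree.XMLSyntaxError:
        return True
    return False


if __name__ == "__main__":
    print("strict parser rejects input:", strict_parse_fails(BROKEN))
    root = parse_leniently(BROKEN)
    print("recovered root tag:", root.tag)
    print("children recovered:", [child.tag for child in root])
```

For files of the size you mention you would not want to load everything with fromstring. As far as I know, recent lxml versions also accept a recover=True keyword directly on etree.iterparse, but please check the documentation for the version you have installed.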