Question
I have ASCII text files that contain XML sections. I try the following basic commands to open the file, but get an error:
import xml.etree.ElementTree as ET
tree = ET.parse('data_file.txt')
Is there a way I can still use ElementTree to parse the XML sections out of the text file?
Answer 1:
You cannot use ElementTree to parse a file that isn't in its entirety well-formed XML. If there is text content before or after the root element of the XML document, XML parsing will fail, as it will if there are any other infractions against well-formedness.
More generally, standards-compliant XML parsers can parse only well-formed XML, so this limitation is not specific to ElementTree. Your scenario, XML embedded in other text, is actually fairly common.
One approach would be to write a program that processes the file and attempts to find the XML embedded in the other content, and that handles that part of the file with ElementTree. If your XML content is simple, this is quite feasible. If it's complex, or if there is more than one XML document embedded in the text file, it gets a little more challenging, but it may still be doable.
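As a rough illustration of that approach, here is a minimal sketch for the simple case. It assumes each embedded XML section has a known root tag that does not nest inside itself and is not written as a self-closing element. The function name extract_xml_sections and its root_tag parameter are made up for this example and do not come from the question. The regular expression only locates candidate spans; ElementTree still does the actual parsing, and spans that are not well-formed are skipped.

```python
import re
import xml.etree.ElementTree as ET


def extract_xml_sections(text, root_tag):
    """Find every <root_tag>...</root_tag> span in text and parse it.

    Hypothetical helper: assumes the root tag name is known in advance
    and never appears nested inside another element of the same name.
    """
    name = re.escape(root_tag)
    # Opening tag (with optional attributes), then the shortest run of
    # characters up to the matching closing tag. DOTALL lets "." span lines.
    pattern = re.compile(
        r"<{0}(?:\s[^>]*)?>.*?</{0}\s*>".format(name),
        re.DOTALL,
    )
    elements = []
    for match in pattern.finditer(text):
        try:
            elements.append(ET.fromstring(match.group(0)))
        except ET.ParseError:
            # The span looked like XML but is not well-formed; skip it.
            continue
    return elements
```

To use it on the file from the question, read the file as text, for example with open('data_file.txt', encoding='ascii') as f: text = f.read(), and call extract_xml_sections(text, 'record'), where 'record' stands in for whatever root tag the embedded sections actually use. If the sections can nest or the root tag is not known ahead of time, a regular expression is not enough, and a small scanner that tracks tag depth would be needed instead.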
Source: https://stackoverflow.com/questions/48131838/use-python-element-tree-to-parse-xml-in-ascii-text-file