Use Python Element Tree to parse xml in ASCII text file [closed]

Submitted by 五迷三道 on 2019-12-12 06:48:04

Question


I have ASCII text files that contain XML sections. I tried the following basic commands to open one of the files, but I get an error:

import xml.etree.ElementTree as ET
tree = ET.parse('data_file.txt')

Is there a way I can still use Element Tree to be able to parse the XML sections out of the text file?


Answer 1:


You cannot use ElementTree to parse a file that isn't in its entirety well-formed XML. If there is text content before or after the root element of the XML document, XML parsing will fail, as it will if there are any other infractions against well-formedness.
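
As a small illustration of that failure mode, the sketch below feeds ElementTree a string with plain text around an XML fragment. The sample content and the `data`/`item` element names are made up for the example; they are not taken from the asker's file.

```python
import xml.etree.ElementTree as ET

# Hypothetical content: plain text before and after one XML fragment.
mixed = "Report header\n<data><item>1</item></data>\ntrailing notes\n"

try:
    ET.fromstring(mixed)
    parsed = True
except ET.ParseError as exc:
    # Text outside the root element makes the input not well-formed.
    parsed = False
    print("not well-formed:", exc)

# The same fragment on its own is well-formed and parses without error.
root = ET.fromstring("<data><item>1</item></data>")
print(root.tag)
```

`ET.parse('data_file.txt')` goes through the same parser, so a file laid out like `mixed` would be expected to raise the same kind of `ParseError`.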

More generally, any standards-compliant XML parser can parse only well-formed XML, so this limitation is not specific to ElementTree, and your scenario is actually fairly common.

One approach would be to write a program that scans the file, locates the XML embedded in the other content, and passes only that part of the file to ElementTree. If your XML content is simple, this is quite feasible. If it's complex, or if there is more than one XML document embedded in the text file, it gets a little more challenging, but it may still be doable.
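
A minimal sketch of that approach might look like the following. It assumes the name of the root element is known in advance and that the root element is never nested inside itself. The helper `extract_xml_sections`, the `data`/`item` element names, and the sample text are all hypothetical, chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET


def extract_xml_sections(text, root_tag):
    """Return a parsed Element for each <root_tag>...</root_tag> span in text.

    Assumption: root_tag is known and does not appear nested inside itself.
    """
    tag = re.escape(root_tag)
    # Opening tag (with optional attributes), shortest possible body, closing tag.
    pattern = re.compile(
        r"<" + tag + r"(?:\s[^>]*)?>.*?</" + tag + r"\s*>",
        re.DOTALL,
    )
    elements = []
    for match in pattern.finditer(text):
        try:
            elements.append(ET.fromstring(match.group(0)))
        except ET.ParseError:
            # The span looked like XML but is not well-formed; skip it.
            continue
    return elements


# Hypothetical file content with two embedded XML documents.
sample = (
    "log line before\n"
    '<data id="a"><item>1</item></data>\n'
    "some text in between\n"
    '<data id="b"><item>2</item><item>3</item></data>\n'
    "log line after\n"
)

sections = extract_xml_sections(sample, "data")
for section in sections:
    print(section.get("id"), [item.text for item in section.findall("item")])
```

For a real file, the text could be read first with something like `open('data_file.txt', encoding='ascii').read()` and then passed to the helper. A regular expression is only a rough way to locate the sections: self-closing root elements, nested elements with the same name as the root, or XML-like text inside comments would need a more careful scanner.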



Source: https://stackoverflow.com/questions/48131838/use-python-element-tree-to-parse-xml-in-ascii-text-file
