How do I get properly escaped XML in python etree untouched?

流过昼夜 submitted on 2019-12-01 19:05:11

Question


I'm using Python 2.7.3.

test.txt:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <test>The tag &lt;StackOverflow&gt; is good to bring up at parties.</test>
</root>

Result:

>>> import xml.etree.ElementTree as ET
>>> e = ET.parse('test.txt')
>>> root = e.getroot()
>>> print root.find('test').text
The tag <StackOverflow> is good to bring up at parties.

As you can see, the parser has unescaped the entities: each &lt; became < and each &gt; became >.

What I'd like to see:

The tag &lt;StackOverflow&gt; is good to bring up at parties.

Untouched, raw text. Sometimes I really like it raw. Uncooked.

I'd like to use this text as-is for display within HTML, therefore I don't want an XML parser to mess with it.

Do I have to re-escape each string, or is there another way?


Answer 1:


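import xml.etree.ElementTree as ET

# Parse the file, then serialize the <test> element back to XML.
# Serializing re-escapes the text, so &lt; and &gt; reappear.
e = ET.parse('test.txt')
root = e.getroot()
print(ET.tostring(root.find('test')))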

yields (the enclosing <test> tags are part of the serialized output)

<test>The tag &lt;StackOverflow&gt; is good to bring up at parties.</test>

Alternatively, you could escape the text with saxutils.escape:

import xml.sax.saxutils as saxutils

# escape() converts &, < and > in the parsed text back to &amp;, &lt; and &gt;.
print(saxutils.escape(root.find('test').text))

yields

The tag &lt;StackOverflow&gt; is good to bring up at parties.
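If many elements need this treatment, the re-escaping step can be wrapped in a small helper. This is only a sketch: escaped_text is a hypothetical name, not part of ElementTree, and the sample document is shortened for illustration. It also guards against elements with no text, whose .text attribute is None.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def escaped_text(element):
    """Return element.text with &, < and > re-escaped ('' if there is no text)."""
    # element.text is None for empty elements, and escape() needs a string.
    return escape(element.text or "")

root = ET.fromstring(
    "<root><test>The tag &lt;StackOverflow&gt; is good.</test><empty/></root>"
)

print(escaped_text(root.find("test")))
print(repr(escaped_text(root.find("empty"))))

# escape() leaves quotes alone by default; pass extra entities if the
# result will be placed inside an HTML attribute value.
attr_safe = escape('say "hi" & <go>', {'"': "&quot;"})
print(attr_safe)
```

By default, saxutils.escape handles only &, < and >. That is enough for HTML text content, but not for attribute values, which is why the last call passes an extra mapping for quotes.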


Source: https://stackoverflow.com/questions/23516664/how-do-i-get-properly-escaped-xml-in-python-etree-untouched
