Entity references and lxml

浪子不回头ぞ submitted on 2019-12-29 07:34:12

Question


Here's the code I have:

from cStringIO import StringIO
from lxml import etree

xml = StringIO('''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY test "This is a test">
]>
<root>
  <sub>&test;</sub>
</root>''')

d1 = etree.parse(xml)
print '%r' % d1.find('/sub').text

# Rewind the buffer: the first parse read it to the end.
xml.seek(0)
parser = etree.XMLParser(resolve_entities=False)
d2 = etree.parse(xml, parser=parser)
print '%r' % d2.find('/sub').text

Here's the output:

'This is a test'
None

How do I get lxml to give me '&test;', i.e., the raw entity reference?


Answer 1:


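With resolve_entities=False, the unresolved entity is kept as a child node of the sub element. That is why sub.text is None: .text only holds the text that precedes the first child node.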

>>> print d2.find('/sub')[0]
&test;
>>> d2.find('/sub').getchildren()
[&test;]


Source: https://stackoverflow.com/questions/2524299/entity-references-and-lxml
