Figuring out where CDATA is in lxml element?
Question

I need to parse and rebuild a file format consumed by a parser whose input language can only charitably be described as XML. I realize that standards-compliant XML doesn't care about either the CDATA or the whitespace, but unfortunately this application demands that I care about both.

I'm using lxml.etree because it's pretty good at preserving CDATA. For example:

    s = ''' <root> <item> <![CDATA[whatever]]> </item> </root>'''
    import lxml.etree as et
    et.fromstring(s, et.XMLParser(strip
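
For reference, here is a minimal, self-contained sketch of the CDATA-preserving behaviour described above. It is not the original snippet: the parser option `strip_cdata=False` is my assumption about the configuration being used, and the sample document and variable names are made up for illustration. It shows that the serialized output keeps the CDATA section while `.text` returns only the plain character data, which is why the position of the CDATA within the element is not obvious from `.text` alone.

```python
import lxml.etree as et

# Illustrative document (hypothetical, modelled on the example in the question).
s = "<root><item> <![CDATA[whatever]]> </item></root>"

# Assumption: strip_cdata=False asks lxml to keep CDATA sections in the tree
# instead of converting them to ordinary text nodes (the default behaviour).
parser = et.XMLParser(strip_cdata=False)
item = et.fromstring(s, parser)[0]

# .text joins the plain text and the CDATA content; the CDATA markers are gone.
text = item.text

# Serializing the element writes the CDATA section back out.
serialized = et.tostring(item, encoding="unicode")

# For contrast: the default parser drops the CDATA wrapper entirely.
default_item = et.fromstring(s)[0]
default_serialized = et.tostring(default_item, encoding="unicode")

print(repr(text))
print(serialized)
print(default_serialized)
```

Comparing `serialized` with `default_serialized` is one rough way to confirm that a CDATA section survived parsing, though it does not by itself say where inside the element's text the section sits.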