XML encoding setup and specific charsets

谁说胖子不能爱 posted on 2019-12-11 01:34:13

Question


I have to read a large XML document (gigabytes) that contains character references of the form &#XX;, where XX is less than 32. I am aware that these code points (< 32) are reserved for ASCII control characters.

The author of the file chose to use these characters inside the text, and changing that is out of my hands.

I have tried encoding declarations other than UTF-8 in the XML declaration at the top of the file (<?xml version="1.0" encoding ="UTF-8"?>), but none of them let my XML parser read the document.

To make the problem reproducible, consider the simple XML file below, which has such a character reference after the name Fred:

<?xml version="1.0" encoding ="UTF-8"?> 
<TABLE> 
 <GRADES> 
 <STUDENT> Fred &#01; </STUDENT> 
 <TEST1> 1 </TEST1> 
 <TEST2> 2 </TEST2> 
 <FINAL> 3 </FINAL> 
 </GRADES> 
 <GRADES> 
 <STUDENT> Wilma </STUDENT> 
 <TEST1> 1 </TEST1> 
 <TEST2> 2 </TEST2> 
 <FINAL> 3 </FINAL> 
 </GRADES> 
</TABLE>

When I open this file in different browsers, I get the error:

error on line 4 at column 22: xmlParseCharRef: invalid xmlChar value 1

I know that one possible solution is to pre-process the original file, finding and replacing the character references that cause the error, but does anybody know another way to work around this problem? Is there a specific encoding that supports &#XX; character references (XX < 32)?


Answer 1:


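Not all characters are legal in XML 1.0 (http://www.w3.org/TR/REC-xml/#charsets). Below #x20, only tab (#x9), line feed (#xA), and carriage return (#xD) are allowed. This holds whatever encoding the declaration names, because a character reference such as &#01; refers to a code point, not to a byte in the file's encoding.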
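If pre-processing turns out to be the only option, the check can be done in a streaming pass. The following is a minimal Python sketch, not a definitive implementation: the names `is_xml10_char`, `scrub_line`, and `scrub_file` are made up for this example, and it assumes the input is UTF-8 and that the offending characters appear only as numeric character references. It works line by line because a character reference cannot contain a line break, so a multi-gigabyte file never has to fit in memory.

```python
import re

# Matches decimal (&#1;) and hexadecimal (&#x1;) character references.
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def is_xml10_char(cp):
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def scrub_line(line, replacement=""):
    """Replace character references that XML 1.0 forbids; keep legal ones."""
    def fix(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return match.group(0) if is_xml10_char(cp) else replacement
    return CHAR_REF.sub(fix, line)


def scrub_file(src_path, dst_path):
    """Copy src_path to dst_path, scrubbing one line at a time."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(scrub_line(line))
```

One known limitation of this sketch: it also rewrites text that merely looks like a character reference inside a CDATA section or a comment, where `&#01;` is literal text rather than a reference.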

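If your tools support XML 1.1, switching them into that mode will allow some of the previously forbidden characters (http://www.w3.org/TR/xml11/#charsets): #x1 through #x1F become legal when written as character references, while #x0 remains forbidden. The document must then declare version="1.1".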

The usual solution is not to try to put control characters into an XML document. Instead, encode the binary data as hex or base64 or some other text representation, and let the application code convert it back to binary when needed.
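As an illustration of the text-representation approach, here is a small Python sketch using only the standard library. The `payload` element and its `encoding` attribute are hypothetical names chosen for this example, not part of the original file's schema; the producer and the consumer would have to agree on whatever convention is actually used.

```python
import base64
import xml.etree.ElementTree as ET


def build_student(name, raw):
    """Serialize one record, carrying raw bytes as base64 text."""
    student = ET.Element("STUDENT")
    student.text = name
    # "payload" and "encoding" are made-up names for this example.
    payload = ET.SubElement(student, "payload", encoding="base64")
    payload.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(student, encoding="unicode")


def read_payload(xml_text):
    """Parse the record and convert the base64 text back to bytes."""
    student = ET.fromstring(xml_text)
    return base64.b64decode(student.find("payload").text)


record = build_student("Fred", b"\x01")
print(record)
print(read_payload(record))
```

The control byte never appears in the serialized document, so any XML 1.0 parser can read it, and the application recovers the original byte only when it needs it.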



Source: https://stackoverflow.com/questions/19961421/xml-encoding-setup-and-specific-charsets
