Non-unicode XML representation

Posted by こ雲淡風輕ζ on 2019-12-11 02:23:13

Question


I have XML in which some of the element values contain Unicode characters. Is it possible to represent this in an ANSI encoding?

E.g.

<?xml version="1.0" encoding="utf-8"?>
<xml>
<value>受</value>
</xml>

to

<?xml version="1.0" encoding="Windows-1252"?>
<xml>
<value>&#27544;</value>
</xml>

I deserialize the XML and then attempt to serialize it using XmlTextWriter, specifying the Default encoding (Default is Windows-1252 on my machine). All the Unicode characters end up as question marks. I'm using VS 2008 with .NET 3.5.


Answer 1:


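Okay, I tested it with the following code:

 // Requires: using System.IO; using System.Text; using System.Xml; using System.Xml.Linq;
 string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><xml><value>受</value></xml>";

 // Encoding.Default is the system ANSI code page (Windows-1252 on this machine).
 XmlWriterSettings settings = new XmlWriterSettings { Encoding = Encoding.Default };
 MemoryStream ms = new MemoryStream();
 // Create is a static member of XmlWriter; disposing the writer flushes it to ms.
 using (XmlWriter writer = XmlWriter.Create(ms, settings))
      XElement.Parse(xml).WriteTo(writer);

 // Decode the bytes with the same encoding to inspect the result.
 string value = Encoding.Default.GetString(ms.ToArray());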

And it correctly escaped the Unicode character as a numeric character reference:

<?xml version="1.0" encoding="Windows-1252"?><xml><value>&#x53D7;</value></xml>

I must be doing something wrong somewhere else. Thanks for the help.
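
For anyone who wants to try the same test as a standalone program, here is a minimal sketch. The names AnsiXmlDemo and Reencode are hypothetical and not part of the original post. It assumes the .NET Framework, where code page 1252 is available through Encoding.GetEncoding without registering an extra provider. It asks for code page 1252 explicitly rather than relying on Encoding.Default, which varies from machine to machine.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class AnsiXmlDemo
{
    // Hypothetical helper: parses the XML text and re-serializes it in the
    // target encoding through a writer obtained from XmlWriter.Create.
    public static string Reencode(string xml, Encoding target)
    {
        XmlWriterSettings settings = new XmlWriterSettings { Encoding = target };
        using (MemoryStream ms = new MemoryStream())
        {
            // Disposing the writer flushes everything into the stream.
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                XElement.Parse(xml).WriteTo(writer);
            }
            // Decode with the same encoding that was used for writing.
            return target.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        // Assumption: code page 1252 is installed (true on the .NET Framework).
        Encoding cp1252 = Encoding.GetEncoding(1252);
        string result = Reencode("<xml><value>\u53D7</value></xml>", cp1252);
        Console.WriteLine(result);
    }
}
```

If the output matches the test above, the value element contains &#x53D7; rather than a question mark.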




Answer 2:


If I understand the question, then yes. You just need a ; after the 27544:

<?xml version="1.0" encoding="Windows-1252"?>
<xml>
<value>&#27544;</value>
</xml>

Or are you wondering how to generate this XML programmatically? If so, what language/environment are you working in?
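
If the environment turns out to be C#, one way to produce such references by hand is sketched below. This is only an illustration: NumericReferenceDemo and EscapeNonAscii are hypothetical names. It assumes the input is already valid XML text, and it simply replaces every character above 0x7F with a decimal reference, which is safe for any ASCII-compatible code page even though Windows-1252 itself could represent some of those characters directly. For 受 (U+53D7) the decimal form is &#21463;.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NumericReferenceDemo
{
    // Hypothetical helper: replaces every non-ASCII character with a decimal
    // numeric character reference. ASCII characters, including markup, pass
    // through unchanged.
    public static string EscapeNonAscii(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] < 0x80)
            {
                sb.Append(text[i]);
                continue;
            }
            // ConvertToUtf32 combines a surrogate pair into one code point
            // and throws ArgumentException on a lone surrogate.
            int codePoint = char.ConvertToUtf32(text, i);
            if (char.IsSurrogatePair(text, i))
            {
                i++; // skip the low surrogate that was just consumed
            }
            sb.Append("&#");
            sb.Append(codePoint.ToString(CultureInfo.InvariantCulture));
            sb.Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // prints <value>&#21463;</value>
        Console.WriteLine(EscapeNonAscii("<value>\u53D7</value>"));
    }
}
```

In practice, letting XmlWriter.Create handle the encoding is usually less error-prone than post-processing strings like this.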



Source: https://stackoverflow.com/questions/82008/non-unicode-xml-representation
