How to handle special characters like &#2; when serializing/deserializing an XML object?

Posted by 我是研究僧i on 2020-01-01 16:27:11

Question


I have some business objects that store customer names. Sometimes a name contains special characters, such as the control character with code 2 (written &#2; in XML). These names are imported from a third party, and I cannot remove the odd characters at the source.

The application serializes and deserializes the customer object with XmlSerializer. The strange thing is that serializing a name with special characters raises no error, and the result looks like <Name>Jim&#2;</Name>. But when I deserialize the output XML, I get the exception "There is an error in XML document (3, 15)."

So how should I handle these special characters in my application? Thanks!

Here is some test code:

    using System.IO;
    using System.Xml.Serialization;

    public class Customer
    {
        public string Name;
    }

    class Program
    {
        public static T DeserializeFromXml<T>(string settings) where T : class
        {
            var serializer = new XmlSerializer(typeof(T));
            var reader = new StringReader(settings);
            var result = serializer.Deserialize(reader);
            return result as T;
        }

        public static string SerializeToXml<T>(T settings)
        {
            var serializer = new XmlSerializer(typeof(T));
            var writer = new StringWriter();
            serializer.Serialize(writer, settings);
            return writer.ToString();
        }

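        static void Main(string[] args)
        {
            // Build a name that ends with the control character U+0002.
            var str = new char[] { 'J', 'i', 'm', (char)2 };
            var customer = new Customer { Name = new string(str) };

            // Serialization succeeds; the control character is written as a character reference.
            var output = SerializeToXml(customer);

            // Per the question, this call fails with "There is an error in XML document (3, 15)."
            var obj = DeserializeFromXml<Customer>(output);
        }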
    }

Answer 1:


I don't have a solution for your question, but here is the background info.

The string &#2; is an XML character reference for the character with code point 2 (U+0002). According to XML 1.0, this is not a valid character. See http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Char.
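To make the Char rule concrete, here is a minimal sketch that strips characters XML 1.0 does not allow before the string is handed to the serializer. The names XmlCharSketch, IsLegalXml10Char and StripIllegalXmlChars are made up for this illustration and are not part of .NET. Dropping the characters is only one option and it loses data, so it only fits if the control characters carry no meaning for you.

```csharp
using System.Text;

// Hypothetical helper class, written for this illustration only.
public static class XmlCharSketch
{
    // True if a single UTF-16 code unit is allowed by the XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD].
    // Code points above U+FFFF are surrogate pairs and are handled by the caller.
    public static bool IsLegalXml10Char(char c)
    {
        return c == '\t' || c == '\n' || c == '\r'
            || (c >= '\u0020' && c <= '\uD7FF')
            || (c >= '\uE000' && c <= '\uFFFD');
    }

    // Returns a copy of the text without characters that XML 1.0 forbids.
    public static string StripIllegalXmlChars(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;

        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsSurrogatePair(text, i))
            {
                // A valid surrogate pair encodes U+10000..U+10FFFF, which XML 1.0 allows.
                sb.Append(c).Append(text[i + 1]);
                i++;
            }
            else if (IsLegalXml10Char(c))
            {
                sb.Append(c);
            }
            // Anything else (control characters, lone surrogates, U+FFFE, U+FFFF) is dropped.
        }
        return sb.ToString();
    }
}
```

With the test code from the question, this would be applied as Name = XmlCharSketch.StripIllegalXmlChars(new string(str)) before calling SerializeToXml.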

.NET is inconsistent here: the XML serializer will happily generate XML documents that contain illegal characters, but the deserializer throws when it encounters one.
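If the characters have to survive a round trip, one workaround that may help is to deserialize through an XmlReader whose XmlReaderSettings.CheckCharacters is set to false, which turns off the validity check for character references such as &#2;. The sketch below is not a definitive fix. LenientXml is a made-up class name, Customer mirrors the class from the question, and the sample XML string is an assumption about what the serializer emits.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Same shape as the Customer class in the question.
public class Customer
{
    public string Name;
}

// Hypothetical helper class, written for this illustration only.
public static class LenientXml
{
    public static T DeserializeFromXml<T>(string xml) where T : class
    {
        var serializer = new XmlSerializer(typeof(T));

        // Do not reject character references that point at characters
        // outside the XML 1.0 Char range (for example &#2;).
        var settings = new XmlReaderSettings { CheckCharacters = false };

        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader, settings))
        {
            return serializer.Deserialize(xmlReader) as T;
        }
    }
}
```

For example, LenientXml.DeserializeFromXml<Customer>("<Customer><Name>Jim&#2;</Name></Customer>") should return a Customer whose Name ends with the character (char)2. The resulting document is still not well-formed XML 1.0, so other XML tools may reject it.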

See http://msdn.microsoft.com/en-us/library/aa302290.aspx for more details.

XML 1.1 relaxes this restriction, but .NET only supports XML 1.0.



Source: https://stackoverflow.com/questions/17357380/how-to-handle-special-characters-like-2-when-serialize-deserialize-xml-object
