Fast way to deserialize XML with special characters

Posted by 帅比萌擦擦* on 2019-12-17 18:59:51

Question


I am looking for a fast way to deserialize XML that contains special characters such as ö.

I was using XmlReader, and it fails to deserialize such characters.

Any suggestions?

EDIT: I am using C#. Code is as follows:

XElement element =.. //has the xml
XmlSerializer serializer = new XmlSerializer(typeof(MyType));
XmlReader reader = element.CreateReader();
object o = serializer.Deserialize(reader);

Answer 1:


I'd guess you're having an encoding issue, not in the XmlReader but with the XmlSerializer.
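
To illustrate what an encoding issue of this kind can look like, here is a minimal sketch. It assumes the XML bytes were written as UTF-8 and then decoded with a different encoding (ISO-8859-1 is picked purely as an example); the class and method names are hypothetical and not part of the question's code.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Encodes the text as UTF-8, then decodes the bytes with the given encoding.
    // If the two encodings differ, non-ASCII characters such as ö come back garbled.
    public static string RoundTrip(string text, Encoding decodeWith)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return decodeWith.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string xml = "<Name>ö</Name>";

        // Same encoding on both sides: the text survives unchanged.
        Console.WriteLine(RoundTrip(xml, Encoding.UTF8) == xml);

        // Mismatched encoding (assumed here to be ISO-8859-1): ö is encoded as
        // two bytes in UTF-8, so it is decoded as two wrong characters.
        Console.WriteLine(RoundTrip(xml, Encoding.GetEncoding("ISO-8859-1")) == xml);
    }
}
```

If something like this happens before the XML ever reaches the serializer, the serializer itself is not at fault, so it is worth checking how the XML was loaded in the first place.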

You could use XmlTextWriter with UTF-8 encoding together with the XmlSerializer, as in the following snippet (see the generic methods further down for a much nicer implementation). It works fine with umlauts (äöü) and other special characters.

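using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

class Program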
{
    static void Main(string[] args)
    {
        SpecialCharacters specialCharacters = new SpecialCharacters { Umlaute = "äüö" };

        // serialize object to xml

        MemoryStream memoryStreamSerialize = new MemoryStream();
        XmlSerializer xmlSerializerSerialize = new XmlSerializer(typeof(SpecialCharacters));
        XmlTextWriter xmlTextWriterSerialize = new XmlTextWriter(memoryStreamSerialize, Encoding.UTF8);

        xmlSerializerSerialize.Serialize(xmlTextWriterSerialize, specialCharacters);
        memoryStreamSerialize = (MemoryStream)xmlTextWriterSerialize.BaseStream;

        // decodes the UTF-8 bytes written to the stream into a string
        UTF8Encoding encodingSerialize = new UTF8Encoding();
        string serializedXml = encodingSerialize.GetString(memoryStreamSerialize.ToArray());

        xmlTextWriterSerialize.Close();
        memoryStreamSerialize.Close();
        memoryStreamSerialize.Dispose();

        // deserialize xml to object

        // converts a string to a UTF-8 byte array.
        UTF8Encoding encodingDeserialize = new UTF8Encoding();
        byte[] byteArray = encodingDeserialize.GetBytes(serializedXml);

        using (MemoryStream memoryStreamDeserialize = new MemoryStream(byteArray))
        {
            XmlSerializer xmlSerializerDeserialize = new XmlSerializer(typeof(SpecialCharacters));

            // read directly from the stream; no writer is needed for deserialization
            SpecialCharacters deserializedObject = (SpecialCharacters)xmlSerializerDeserialize.Deserialize(memoryStreamDeserialize);
        }
    }
}

[Serializable]
public class SpecialCharacters
{
    public string Umlaute { get; set; }
}

I personally use the following generic methods to serialize and deserialize XML and objects, and I haven't had any performance or encoding issues with them so far.

public static string SerializeObjectToXml<T>(T obj)
{
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

    // the using blocks close the writer and the stream even if serialization throws
    using (MemoryStream memoryStream = new MemoryStream())
    using (XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8))
    {
        xmlSerializer.Serialize(xmlTextWriter, obj);
        xmlTextWriter.Flush();

        return ByteArrayToStringUtf8(memoryStream.ToArray());
    }
}

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(StringToByteArrayUtf8(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (StreamReader xmlStreamReader = new StreamReader(memoryStream, Encoding.UTF8))
        {
            return (T)xmlSerializer.Deserialize(xmlStreamReader);
        }
    }
}

public static string ByteArrayToStringUtf8(byte[] value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetString(value);
}

public static byte[] StringToByteArrayUtf8(string value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(value);
}
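
One detail worth knowing about this approach: as far as I can tell, an XmlTextWriter created over a stream with Encoding.UTF8 writes a byte order mark first, so the resulting string starts with the character U+FEFF. Reading it back through a stream or StreamReader copes with that, but other consumers may not. The sketch below is hypothetical (BomDemo and its Serialize helper are not from the answer) and compares Encoding.UTF8 with new UTF8Encoding(false), which emits no byte order mark.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Mirrors the SpecialCharacters class used in the answer.
public class SpecialCharacters
{
    public string Umlaute { get; set; }
}

public static class BomDemo
{
    // Serializes obj to an XML string, writing and decoding with the given encoding.
    public static string Serialize<T>(T obj, Encoding encoding)
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (MemoryStream memoryStream = new MemoryStream())
        using (XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, encoding))
        {
            xmlSerializer.Serialize(xmlTextWriter, obj);
            xmlTextWriter.Flush();
            return encoding.GetString(memoryStream.ToArray());
        }
    }

    public static void Main()
    {
        SpecialCharacters value = new SpecialCharacters { Umlaute = "äüö" };

        string withBom = Serialize(value, Encoding.UTF8);
        string withoutBom = Serialize(value, new UTF8Encoding(false));

        // Compare single characters: culture-sensitive string comparisons ignore U+FEFF.
        Console.WriteLine(withBom[0] == '\uFEFF');
        Console.WriteLine(withoutBom[0] == '<');
    }
}
```

Whether the byte order mark matters depends on what consumes the string, so treat this as something to check rather than a required change.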



Answer 2:


What works for me is similar to what @martin-buberl suggested:

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
        StreamReader reader = new StreamReader(memoryStream, Encoding.UTF8);
        return (T)xmlSerializer.Deserialize(reader);
    }
}
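
Since the XML is already a string here, a possible variation is to skip the byte array and hand the serializer a StringReader. This is a different technique from the one above, shown only as a sketch; Person and StringReaderDemo are hypothetical names. One caveat: it assumes the string does not start with a byte order mark character, which a plain string reader would not strip.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only for this illustration.
public class Person
{
    public string Name { get; set; }
}

public static class StringReaderDemo
{
    // Deserializes directly from the string; a .NET string is already decoded
    // text, so no byte encoding is involved at this step.
    public static T DeserializeXmlToObject<T>(string xml)
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (StringReader reader = new StringReader(xml))
        {
            return (T)xmlSerializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml = "<Person><Name>Jörg Åström</Name></Person>";
        Person person = DeserializeXmlToObject<Person>(xml);

        Console.WriteLine(person.Name == "Jörg Åström");
    }
}
```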



Answer 3:


    [XmlElement(ElementName = "Profiles")]
    //public ProfilesType[] Profiles { get; set; }
    public Profiles Profiles { get; set; }

Have you tried something like the above?

I haven't checked, but this sprang to mind. I managed to serialize and deserialize data that contains åäö etc. You are not talking about tag names, are you?
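
For what it's worth, here is a small sketch of the same path the question uses (XElement, CreateReader, XmlSerializer) with åäö in the element content. The shape of MyType is an assumption, since the question does not show it, and XElementDemo is a hypothetical name.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Serialization;

// Assumed shape of the question's MyType; the real class is not shown there.
public class MyType
{
    public string Name { get; set; }
}

public static class XElementDemo
{
    // Deserializes an in-memory XElement through an XmlReader, as in the question.
    public static MyType FromElement(XElement element)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(MyType));

        using (XmlReader reader = element.CreateReader())
        {
            return (MyType)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // The XElement is built in memory, so no byte encoding is involved.
        XElement element = new XElement("MyType", new XElement("Name", "åäö"));

        Console.WriteLine(FromElement(element).Name == "åäö");
    }
}
```

If this works but the real data fails, the characters were probably already damaged when the XElement was loaded.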



Source: https://stackoverflow.com/questions/4899872/fast-way-to-deserialize-xml-with-special-characters
