Use C# XmlSerializer to write in chunks for large sets of objects to avoid Out of Memory

Submitted by 半世苍凉 on 2019-12-20 01:40:01

Question


I like how XmlSerializer works: so simple and elegant, and driven by attributes =p However, I am running into an Out of Memory issue while building up a collection of all my objects prior to serializing them to an XML file.

I am populating an object from a SQL database and intend to write it out to XML using XmlSerializer. It works great for small subsets, but if I try to grab all the objects from the DB I hit an Out of Memory exception.

Does XmlSerializer have some way to let me grab a batch of 100 objects out of the database, write them, then grab the next batch of 100 objects and append it to the XML?

I am hoping I don't have to bust out XmlDocument or something else that requires more manual coding effort...


Answer 1:


XmlSerializer can, in fact, stream enumerable data in and out when serializing. It has special handling for a class that implements IEnumerable<T>. From the docs:

The XmlSerializer gives special treatment to classes that implement IEnumerable or ICollection. A class that implements IEnumerable must implement a public Add method that takes a single parameter. The Add method's parameter must be of the same type as is returned from the Current property on the value returned from GetEnumerator, or one of that type's bases.

When serializing such classes, XmlSerializer simply iterates through the enumerable, writing each current value to the output stream. It does not load the entire enumerable into a list first. Thus, if you have some LINQ query that dynamically pages in results of type T from a database in chunks, you can serialize all of them out without loading them all at once using the following wrapper:

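// Namespaces used by the snippets in this answer.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

// Proxy class for any enumerable with the requisite `Add` methods.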
public class EnumerableProxy<T> : IEnumerable<T>
{
    [XmlIgnore]
    public IEnumerable<T> BaseEnumerable { get; set; }

    public void Add(T obj)
    {
        throw new NotImplementedException();
    }

    #region IEnumerable<T> Members

    public IEnumerator<T> GetEnumerator()
    {
        if (BaseEnumerable == null)
            return Enumerable.Empty<T>().GetEnumerator();
        return BaseEnumerable.GetEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}

Note this class is only useful for serializing, not deserializing. Here is an example of how to use it:

public class RootObject<T>
{
    [XmlIgnore]
    public IEnumerable<T> Results { get; set; }

    [XmlArray("Results")]
    public EnumerableProxy<T> ResultsProxy { 
        get
        {
            return new EnumerableProxy<T> { BaseEnumerable = Results };
        }
        set
        {
            throw new NotImplementedException();
        }
    }
}

public class TestClass
{
    XmlWriter xmlWriter;
    TextWriter textWriter;

    public void Test()
    {
        try
        {
            var root = new RootObject<int>();
            root.Results = GetResults();

            using (textWriter = new StringWriter())
            {
                var settings = new XmlWriterSettings { Indent = true, IndentChars = "  " };
                using (xmlWriter = XmlWriter.Create(textWriter, settings))
                {
                    (new XmlSerializer(root.GetType())).Serialize(xmlWriter, root);
                }
                var xml = textWriter.ToString();
                Debug.WriteLine(xml);
            }
        }
        finally
        {
            xmlWriter = null;
            textWriter = null;
        }
    }

    IEnumerable<int> GetResults()
    {
        foreach (var i in Enumerable.Range(0, 1000))
        {
            if (i > 0 && (i % 500) == 0)
            {
                HalfwayPoint();
            }
            yield return i;
        }
    }

    private void HalfwayPoint()
    {
        if (xmlWriter != null)
        {
            xmlWriter.Flush();
            var xml = textWriter.ToString();
            Debug.WriteLine(xml);
        }
    }
}

If you set a breakpoint in HalfwayPoint(), you will see that half the XML has already been written out while the enumerable is still being iterated. (Of course, I'm just writing to a string for test purposes, while you would probably be writing to a file.)
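
To tie this back to the "batches of 100" in the question, here is a rough sketch of how a paged source could be fed through the proxy and written straight to a file. It is only an illustration under some assumptions: Paging, ReadInPages, fetchPage and the results.xml file name are all made up for this example, the in-memory list stands in for the real SQL query, and EnumerableProxy/RootObject are condensed copies of the classes above so the snippet compiles on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

// Condensed copy of the proxy shown earlier in this answer.
public class EnumerableProxy<T> : IEnumerable<T>
{
    [XmlIgnore]
    public IEnumerable<T> BaseEnumerable { get; set; }

    // XmlSerializer requires an Add method on IEnumerable types; it is not used when serializing.
    public void Add(T obj) { throw new NotImplementedException(); }

    public IEnumerator<T> GetEnumerator()
    {
        return (BaseEnumerable ?? Enumerable.Empty<T>()).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Condensed copy of the root object shown earlier in this answer.
public class RootObject<T>
{
    [XmlIgnore]
    public IEnumerable<T> Results { get; set; }

    [XmlArray("Results")]
    public EnumerableProxy<T> ResultsProxy
    {
        get { return new EnumerableProxy<T> { BaseEnumerable = Results }; }
        set { throw new NotImplementedException(); }
    }
}

// Hypothetical helper: turns "fetch one page" calls into a single lazy sequence.
public static class Paging
{
    // fetchPage(skip, take) is an assumed stand-in for a database query that
    // returns at most `take` rows starting at offset `skip`, in a stable order.
    public static IEnumerable<T> ReadInPages<T>(Func<int, int, IList<T>> fetchPage, int pageSize)
    {
        for (int skip = 0; ; skip += pageSize)
        {
            IList<T> page = fetchPage(skip, pageSize);
            foreach (T item in page)
                yield return item;      // hand items to the serializer one at a time
            if (page.Count < pageSize)
                yield break;            // a short (or empty) page means no more rows
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Fake "table" of 250 rows standing in for the SQL database.
        List<int> table = Enumerable.Range(0, 250).ToList();
        Func<int, int, IList<int>> fetchPage =
            (skip, take) => table.Skip(skip).Take(take).ToList();

        var root = new RootObject<int> { Results = Paging.ReadInPages(fetchPage, 100) };

        var settings = new XmlWriterSettings { Indent = true };
        using (XmlWriter writer = XmlWriter.Create("results.xml", settings))
        {
            // The next page is only fetched when the serializer asks for more items.
            new XmlSerializer(typeof(RootObject<int>)).Serialize(writer, root);
        }
    }
}
```

With a real database, fetchPage would presumably run a query with a deterministic ORDER BY, so that rows are neither skipped nor repeated between pages. Only the current page of 100 objects needs to be alive at any time, since the previous pages become eligible for garbage collection once they have been written.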



Source: https://stackoverflow.com/questions/28837438/use-c-sharp-xmlserializer-to-write-in-chunks-for-large-sets-of-objects-to-avoid
