Question
I like how XmlSerializer works: it is simple, elegant, and driven by attributes =p However, I am running into an Out of Memory issue while building up a collection of all my objects prior to serializing them to an XML file.
I am populating an object from a SQL database and intend to write the object out to XML using XmlSerializer. It works great for small subsets, but if I try to grab all the objects from the DB I hit an Out of Memory exception.
Is there some feature of XmlSerializer that would allow me to grab a batch of 100 objects out of the database, write them, then grab the next batch of 100 objects and append them to the XML?
I am hoping I don't have to bust out XmlDocument or something else that requires more manual coding effort...
Answer 1:
XmlSerializer can, in fact, stream enumerable data in and out when serializing. It has special handling for a class that implements IEnumerable<T>. From the docs:
The XmlSerializer gives special treatment to classes that implement IEnumerable or ICollection. A class that implements IEnumerable must implement a public Add method that takes a single parameter. The Add method's parameter must be of the same type as is returned from the Current property on the value returned from GetEnumerator, or one of that type's bases.
When serializing such classes, XmlSerializer simply iterates through the enumerable, writing each current value to the output stream. It does not load the entire enumerable into a list first. Thus, if you have some LINQ query that dynamically pages in results of type T from a database in chunks (example here), you can serialize all of them out without loading them all at once using the following wrapper:
// Namespaces used by the snippets in this answer.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

// Proxy class for any enumerable with the requisite `Add` methods.
public class EnumerableProxy<T> : IEnumerable<T>
{
    [XmlIgnore]
    public IEnumerable<T> BaseEnumerable { get; set; }

    // Required by XmlSerializer on enumerable types; never called while serializing.
    public void Add(T obj)
    {
        throw new NotImplementedException();
    }

    #region IEnumerable<T> Members

    public IEnumerator<T> GetEnumerator()
    {
        if (BaseEnumerable == null)
            return Enumerable.Empty<T>().GetEnumerator();
        return BaseEnumerable.GetEnumerator();
    }

    #endregion

    #region IEnumerable Members

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    #endregion
}
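The wrapper only helps if the underlying enumerable is itself lazy. As a minimal sketch of such a source, assuming a hypothetical fetchBatch(offset, count) delegate that stands in for whatever paged database query you use (it is not part of XmlSerializer or of the code in this answer), an iterator can pull 100 rows at a time and hand them out one by one:

```csharp
using System;
using System.Collections.Generic;

public static class BatchedSource
{
    // fetchBatch(offset, count) is a hypothetical stand-in for a paged database query
    // (for example one built on OFFSET/FETCH or Skip/Take). It is assumed to return
    // fewer than `count` rows, possibly none, once the data runs out.
    public static IEnumerable<T> ReadInBatches<T>(Func<int, int, IList<T>> fetchBatch, int batchSize = 100)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int offset = 0; ; offset += batchSize)
        {
            IList<T> batch = fetchBatch(offset, batchSize);

            foreach (T item in batch)
                yield return item;   // the consumer pulls rows one at a time

            if (batch.Count < batchSize)
                yield break;         // short or empty batch: nothing left to read
        }
    }
}
```

The returned sequence can be assigned to BaseEnumerable, for example ReadInBatches<int>((offset, count) => LoadRows(offset, count)) with LoadRows as a placeholder for your own query. Because the next batch is only fetched when the serializer asks for the next item, roughly one batch is alive at a time.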
Note that EnumerableProxy<T> is only useful for serializing, not deserializing, because its Add method throws. Here is an example of how to use it:
public class RootObject<T>
{
    [XmlIgnore]
    public IEnumerable<T> Results { get; set; }

    [XmlArray("Results")]
    public EnumerableProxy<T> ResultsProxy
    {
        get
        {
            return new EnumerableProxy<T> { BaseEnumerable = Results };
        }
        set
        {
            throw new NotImplementedException();
        }
    }
}
public class TestClass
{
    // Kept as fields so HalfwayPoint() can inspect the output mid-serialization.
    XmlWriter xmlWriter;
    TextWriter textWriter;

    public void Test()
    {
        try
        {
            var root = new RootObject<int>();
            root.Results = GetResults();
            using (textWriter = new StringWriter())
            {
                var settings = new XmlWriterSettings { Indent = true, IndentChars = " " };
                using (xmlWriter = XmlWriter.Create(textWriter, settings))
                {
                    (new XmlSerializer(root.GetType())).Serialize(xmlWriter, root);
                }
                var xml = textWriter.ToString();
                Debug.WriteLine(xml);
            }
        }
        finally
        {
            xmlWriter = null;
            textWriter = null;
        }
    }

    // Lazy source: values are produced only as the serializer asks for them.
    IEnumerable<int> GetResults()
    {
        foreach (var i in Enumerable.Range(0, 1000))
        {
            if (i > 0 && (i % 500) == 0)
            {
                HalfwayPoint();
            }
            yield return i;
        }
    }

    // Flushes the writer and dumps whatever XML has been produced so far.
    private void HalfwayPoint()
    {
        if (xmlWriter != null)
        {
            xmlWriter.Flush();
            var xml = textWriter.ToString();
            Debug.WriteLine(xml);
        }
    }
}
If you set a breakpoint in HalfwayPoint(), you will see that half the XML has already been written out while still iterating through the enumerable. (Of course, I'm just writing to a string for test purposes while you would probably be writing to a file.)
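For the file case, the StringWriter can be swapped for a FileStream. The following is only a sketch: FileExport and WriteToFile are illustrative names, not part of the answer, and the two wrapper types are repeated in compact form purely so the snippet compiles on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Compact copies of the answer's wrapper types, repeated so this sketch is self-contained.
public class EnumerableProxy<T> : IEnumerable<T>
{
    [XmlIgnore]
    public IEnumerable<T> BaseEnumerable { get; set; }

    public void Add(T obj) { throw new NotImplementedException(); }

    public IEnumerator<T> GetEnumerator()
    {
        return (BaseEnumerable ?? new T[0]).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public class RootObject<T>
{
    [XmlIgnore]
    public IEnumerable<T> Results { get; set; }

    [XmlArray("Results")]
    public EnumerableProxy<T> ResultsProxy
    {
        get { return new EnumerableProxy<T> { BaseEnumerable = Results }; }
        set { throw new NotImplementedException(); }
    }
}

public static class FileExport
{
    // Hypothetical helper: serializes `items` to the file at `path`,
    // enumerating them one by one rather than collecting them first.
    public static void WriteToFile<T>(string path, IEnumerable<T> items)
    {
        var root = new RootObject<T> { Results = items };
        var settings = new XmlWriterSettings { Indent = true };

        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var xmlWriter = XmlWriter.Create(stream, settings))
        {
            new XmlSerializer(typeof(RootObject<T>)).Serialize(xmlWriter, root);
        }
    }
}
```

With a lazy source passed as items, the writer and stream buffer only a small window of output before pushing it to disk, so memory use should stay roughly flat regardless of how many objects are written.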
Source: https://stackoverflow.com/questions/28837438/use-c-sharp-xmlserializer-to-write-in-chunks-for-large-sets-of-objects-to-avoid