Memory usage serializing chunked byte arrays with Protobuf-net

Submitted by 旧巷老猫 on 2019-12-19 06:06:07

Question


In our application we have some data structures which, amongst other things, contain a chunked list of bytes (currently exposed as a List<byte[]>). We chunk the bytes up because if we allow large byte arrays onto the large object heap (LOH), we suffer from memory fragmentation over time.
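
For illustration, here is a minimal sketch of that kind of chunking (the Chunking class and the 80 KiB chunk size are assumptions for this example, the latter chosen to stay below the ~85,000-byte threshold at which .NET allocates arrays on the LOH):

using System;
using System.Collections.Generic;
using System.IO;

static class Chunking
{
    // Assumption: 80 KiB keeps each array safely below the ~85,000-byte
    // threshold at which .NET allocates on the large object heap.
    const int ChunkSize = 80 * 1024;

    // Hypothetical helper illustrating the chunking described above.
    public static List<byte[]> Chunk(Stream source)
    {
        var chunks = new List<byte[]>();
        var buffer = new byte[ChunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            var chunk = new byte[read];
            Array.Copy(buffer, chunk, read);
            chunks.Add(chunk); // each chunk stays off the LOH
        }
        return chunks;
    }
}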

We've also started using Protobuf-net to serialize these structures, using our own generated serialization DLL.

However, we've noticed that Protobuf-net creates very large in-memory buffers while serializing. Glancing through the source code, it appears that it can't flush its internal buffer until the entire List<byte[]> structure has been written, because it needs to write the total length at the front of the buffer afterwards.

This unfortunately undoes our work in chunking the bytes in the first place, and eventually gives us OutOfMemoryExceptions due to memory fragmentation (the exception occurs at the point where Protobuf-net tries to expand the buffer beyond 84 KB, which obviously puts it on the LOH, even though our overall process memory usage is fairly low).

If my analysis of how Protobuf-net is working is correct, is there a way around this issue?


Update

Based on Marc's answer, here is what I've tried:

[ProtoContract]
[ProtoInclude(1, typeof(A), DataFormat = DataFormat.Group)]
public class ABase
{
}

[ProtoContract]
public class A : ABase
{
    [ProtoMember(1, DataFormat = DataFormat.Group)]
    public B B
    {
        get;
        set;
    }
}

[ProtoContract]
public class B
{
    [ProtoMember(1, DataFormat = DataFormat.Group)]
    public List<byte[]> Data
    {
        get;
        set;
    }
}

Then to serialize it:

var a = new A();
var b = new B();
a.B = b;
b.Data = new List<byte[]>
{
    Enumerable.Range(0, 1999).Select(v => (byte)v).ToArray(),
    Enumerable.Range(2000, 3999).Select(v => (byte)v).ToArray(),
};

var stream = new MemoryStream();
Serializer.Serialize(stream, a);

However, if I set a breakpoint in ProtoWriter.WriteBytes() where it calls DemandSpace() towards the bottom of the method, and step into DemandSpace(), I can see that the buffer isn't being flushed because writer.flushLock equals 1.

If I create another base class for ABase like this:

[ProtoContract]
[ProtoInclude(1, typeof(ABase), DataFormat = DataFormat.Group)]
public class ABaseBase
{
}

[ProtoContract]
[ProtoInclude(1, typeof(A), DataFormat = DataFormat.Group)]
public class ABase : ABaseBase
{
}

Then writer.flushLock equals 2 in DemandSpace().

I'm guessing I've missed an obvious step here to do with derived types?


Answer 1:

I'm going to read between some lines here... because List<T> (mapped as repeated in protobuf parlance) doesn't have an overall length-prefix, and byte[] (mapped as bytes) has a trivial length-prefix that shouldn't cause additional buffering. So I'm guessing what you actually have is more like:

[ProtoContract]
public class A {
    [ProtoMember(1)]
    public B Foo {get;set;}
}
[ProtoContract]
public class B {
    [ProtoMember(1)]
    public List<byte[]> Bar {get;set;}
}

Here, the need to buffer for a length-prefix actually arises when writing A.Foo, basically to declare "the following complex data is the value for A.Foo". Fortunately there is a simple fix:

[ProtoMember(1, DataFormat=DataFormat.Group)]
public B Foo {get;set;}

This switches between the two packing techniques in protobuf:

  • the default (Google's stated preference) is length-prefixed, meaning you get a marker indicating the length of the message to follow, then the sub-message payload
  • but there is also an option to use a start-marker, the sub-message payload, and an end-marker

When using the second technique it doesn't need to buffer, so: it doesn't. This does mean it will be writing slightly different bytes for the same data, but protobuf-net is very forgiving, and will happily deserialize data from either format here. Meaning: if you make this change, you can still read your existing data, but new data will use the start/end-marker technique.
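
For the curious, here is a small sketch showing the two encodings on the wire (the contract types here are hypothetical, invented just for this demo):

using System;
using System.IO;
using ProtoBuf;

[ProtoContract]
class Inner
{
    [ProtoMember(1)]
    public int Value { get; set; }
}

[ProtoContract]
class PrefixedOuter
{
    [ProtoMember(1)] // default: length-prefixed (wire type 2)
    public Inner Child { get; set; }
}

[ProtoContract]
class GroupedOuter
{
    [ProtoMember(1, DataFormat = DataFormat.Group)] // start/end markers (wire types 3 and 4)
    public Inner Child { get; set; }
}

static class WireDemo
{
    static byte[] Dump<T>(T obj)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, obj);
            return ms.ToArray();
        }
    }

    static void Main()
    {
        // 0A-02-08-01: field 1/length-prefixed, length 2, then the payload
        Console.WriteLine(BitConverter.ToString(Dump(new PrefixedOuter { Child = new Inner { Value = 1 } })));
        // 0B-08-01-0C: field 1/start-group, the payload, field 1/end-group
        Console.WriteLine(BitConverter.ToString(Dump(new GroupedOuter { Child = new Inner { Value = 1 } })));
    }
}

Note how the grouped form can be written in a single forwards pass: the writer never needs to know the payload length up front.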

This raises the question: why does Google prefer the length-prefix approach? Probably because it is more efficient, when reading, to skip through fields (either via a raw reader API, or as unwanted/unexpected data) with a length-prefix: you just read the length and then advance the stream [n] bytes. By contrast, to skip data with a start/end-marker you still need to crawl through the payload, skipping the sub-fields individually. Of course, this theoretical difference in read performance doesn't apply if you expect that data and want to read it into your object, which you almost certainly do. Also, in the Google protobuf implementation, because it isn't working with a regular POCO model, the sizes of the payloads are already known, so they don't really see the same issue when writing.
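
To make that skipping cost concrete, here is a rough hand-rolled sketch of a wire-format skipper (this is not protobuf-net's API; the wire-type numbers come from the protobuf specification):

using System.IO;

static class WireSkip
{
    // Read a base-128 varint from the stream (protobuf's integer encoding).
    static int ReadVarint(Stream s)
    {
        int result = 0, shift = 0, b;
        do
        {
            if ((b = s.ReadByte()) < 0) throw new EndOfStreamException();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // Skip one field whose tag (field number + wire type) has already been read.
    public static void SkipField(Stream s, int wireType)
    {
        switch (wireType)
        {
            case 0: ReadVarint(s); break;                              // varint
            case 1: s.Seek(8, SeekOrigin.Current); break;              // fixed64
            case 2: s.Seek(ReadVarint(s), SeekOrigin.Current); break;  // length-prefixed: a single seek
            case 3:                                                    // start-group: crawl every sub-field
            {
                int tag;
                while (((tag = ReadVarint(s)) & 7) != 4)               // ...until the matching end-group
                    SkipField(s, tag & 7);
                break;
            }
            case 5: s.Seek(4, SeekOrigin.Current); break;              // fixed32
        }
    }
}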

Answer 2:

Additionally, regarding your edit: the [ProtoInclude(..., DataFormat=...)] looks like it simply wasn't being processed. I have added a test for this in my current local build, and it now passes:

[Test]
public void Execute()
{

    var a = new A();
    var b = new B();
    a.B = b;

    b.Data = new List<byte[]>
    {
        Enumerable.Range(0, 1999).Select(v => (byte)v).ToArray(),
        Enumerable.Range(2000, 3999).Select(v => (byte)v).ToArray(),
    };

    var stream = new MemoryStream();
    var model = TypeModel.Create();
    model.AutoCompile = false;
#if DEBUG
    // ForwardsOnly is only available in debug builds; if set, an exception
    // is thrown if the stream tries to buffer
    model.ForwardsOnly = true;
#endif
    CheckClone(model, a);
    model.CompileInPlace();
    CheckClone(model, a);
    CheckClone(model.Compile(), a);
}
void CheckClone(TypeModel model, A original)
{
    int sum = original.B.Data.Sum(x => x.Sum(b => (int)b));
    var clone = (A)model.DeepClone(original);
    Assert.IsInstanceOfType(typeof(A), clone);
    Assert.IsInstanceOfType(typeof(B), clone.B);
    Assert.AreEqual(sum, clone.B.Data.Sum(x => x.Sum(b => (int)b)));
}

This change is tied into some other, unrelated refactorings (some rework for WinRT / IKVM), but it should be committed ASAP.

Source: https://stackoverflow.com/questions/11317045/memory-usage-serializing-chunked-byte-arrays-with-protobuf-net
