Quicker way to deepcopy objects in golang

Submitted on 2019-11-30 20:17:49

JSON vs gob difference

The encoding/gob package needs to transmit type definitions:

The implementation compiles a custom codec for each data type in the stream and is most efficient when a single Encoder is used to transmit a stream of values, amortizing the cost of compilation.

When you "first" serialize a value of a type, the definition of the type also has to be included / transmitted, so the decoder can properly interpret and decode the stream:

A stream of gobs is self-describing. Each data item in the stream is preceded by a specification of its type, expressed in terms of a small set of predefined types.

This is explained in great detail here: Efficient Go serialization of struct to disk

So since in your case a new gob encoder and decoder is created for each clone, the type description is transmitted every single time, and that is the "bottleneck", the part that makes it slow. When encoding to / decoding from JSON, no type description is included in the representation.

To prove it, make this simple change:

type Test struct {
    Prop1 [1000]int
    Prop2 [1000]string
}

What we did here is make the field types arrays, "multiplying" the values a thousand times, while the type information effectively remains the same (all elements in the arrays have the same type). We create values of them like this:

r := Test{Prop1: [1000]int{}, Prop2: [1000]string{}}

Now running your test program, the output on my machine:

Original:

2017/10/17 14:55:53 Total items 3000, Clone avg. 33.63µs, DeepCopy avg. 2.326µs, Total Difference 93.910918ms

Modified version:

2017/10/17 14:56:38 Total items 3000, Clone avg. 119.899µs, DeepCopy avg. 462.608µs, Total Difference -1.02812648s

As you can see, in the original version JSON is faster, but in the modified version gob became faster, as the cost of transmitting type info amortized.

Testing / benchmarking method

Now on to your testing method. Measuring performance this way is unreliable and can yield quite inaccurate results. Instead, you should use Go's built-in testing and benchmark tools. For details, read Order of the code and performance.

Caveats of these cloning methods

These methods work with reflection and thus can only "clone" fields that are accessible via reflection, that is: exported fields. Also, they don't preserve pointer identity. By this I mean that if you have 2 pointer fields in a struct, both pointing to the same object (the pointers being equal), then after marshaling and unmarshaling you'll get 2 different pointers pointing to 2 different values. This may even cause problems in certain situations.

The "proper" way of cloning

Considering the caveats mentioned above, often the proper way of cloning needs help from the "inside". That is, cloning a specific type is often only possible if that type (or the package of that type) provides this functionality.

Yes, providing "manual" cloning functionality is less convenient, but on the other hand it will outperform the above methods (possibly even by orders of magnitude), and it requires the least amount of "working" memory for the cloning process.
