Array vs Slice: accessing speed

风流意气都作罢 submitted on 2019-11-28 21:25:27

Comparing the amd64 assembly of both BenchmarkArrayLocal and BenchmarkSliceLocal (too long to fit in this post):

The array version loads the address of a from memory multiple times, practically on every array-access operation:

LEAQ    "".a+1000(SP),BX

The slice version, by contrast, loads the address from memory once and then computes exclusively in registers:

LEAQ    (DX)(SI*1),BX

This is not conclusive, but it is probably the cause, since the two functions are otherwise virtually identical. One other notable detail: the array version calls into runtime.duffcopy, a fairly long assembly routine, whereas the slice version does not.

Go 1.8 can eliminate some bounds (range) checks, so the difference got bigger:

BenchmarkSliceGlobal-4     500000    3220 ns/op
BenchmarkArrayGlobal-4    1000000    1287 ns/op
BenchmarkSliceLocal-4     1000000    1267 ns/op
BenchmarkArrayLocal-4     1000000    1301 ns/op
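The benchmark source is not reproduced in this post, so the following is only a rough sketch of how a global slice and a global array could be compared. The size of 1000, the loop body, and the helper names bumpSliceGlobal and bumpArrayGlobal are assumptions made for illustration, not the original code. It uses testing.Benchmark so it can run as an ordinary program.

```go
package main

import (
	"fmt"
	"testing"
)

// size is an assumed element count; the original benchmark's size is not shown here.
const size = 1000

var (
	gs = make([]byte, size) // global slice: length is only known at run time
	ga [size]byte           // global array: length is part of the type
)

// bumpSliceGlobal (hypothetical helper) increments every element of the global slice.
func bumpSliceGlobal() {
	for j := range gs {
		gs[j]++
	}
}

// bumpArrayGlobal (hypothetical helper) increments every element of the global array.
func bumpArrayGlobal() {
	for j := range ga {
		ga[j]++
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	rs := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			bumpSliceGlobal()
		}
	})
	ra := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			bumpArrayGlobal()
		}
	})
	fmt.Println("SliceGlobal:", rs)
	fmt.Println("ArrayGlobal:", ra)
}
```

The absolute numbers will vary with the Go version and the machine, so only the relative difference between the two lines is meaningful.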

For arrays, I'd recommend using power-of-two sizes and masking the index with a bitwise AND. That way you can be sure the compiler eliminates the bounds check. For example: var ga [1024]byte with ga[j & 1023].
