How to store binary data when you only care about speed?

前端未结

关注

 3  1576

我寻月下人不归 2020-12-21 22:58

I have N points in D dimensions, where let\'s say N is 1 million and D 1 hundred. All my points have binary coordinates, i.e. {0, 1}^D, and I am only interested in speed

3条回答

清酒与你 (楼主)

2020-12-21 23:36

If the values are independently, uniformly distributed, and you want to find the Hamming distance between two independently, randomly chosen points, the most efficient layout is a packed array of bits.

This packed array would ideally be chunked into the largest block size over which your popcnt instruction works: 64 bits. The hamming distance is the sum of popcnt(x_blocks[i] ^ y_blocks[i]). On processors with efficient unaligned accesses, byte alignment with unaligned reads is likely to be most efficient. On processors where unaligned reads incur a penalty, one should consider whether the memory overhead of aligned rows is worth faster logic.

0 讨论(0)

查看其它3个回答

发布评论:

提交评论

加载中...

验证码

看不清?

提交回复