Why is setting a hash table's length to a prime number good practice?

被撕碎了的回忆 2020-12-13 19:47

I was going through Eric Lippert's latest blog post, Guidelines and rules for GetHashCode, when I hit this paragraph:

We could be even more clever here;

3 Answers
  •  一生所求
    2020-12-13 20:05

    Say your bucket set length is a power of 2 - that makes the mod calculations quite fast. It also means that the bucket selection is determined solely by the bottom n bits of the hash code (where the table length is 2^n), so the remaining 32 - n bits are effectively discarded. It's like you're throwing away useful bits of the hash code immediately.
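
    A minimal sketch of that point, assuming a hypothetical 16-bucket table (2^4): hash codes that differ only in their upper 28 bits all land in the same bucket, because the modulo only looks at the low 4 bits.

    ```java
    public class PowerOfTwoBuckets {
        public static void main(String[] args) {
            int tableLength = 16; // 2^4, so only the low 4 bits pick the bucket

            // Hash codes that differ only in their upper 28 bits
            int[] hashes = { 0x00000005, 0x12340005, 0xABCD0005 };

            for (int h : hashes) {
                // Mask to a non-negative value, then mod by the table length.
                // For a power-of-two length this is the same as h & 0b1111:
                // the top 28 bits never influence the bucket index.
                int bucket = (h & 0x7FFFFFFF) % tableLength;
                System.out.printf("hash=0x%08X -> bucket %d%n", h, bucket);
            }
            // All three hashes print "bucket 5": the upper bits were thrown away.
        }
    }
    ```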

    Or, as this blog post from 2006 puts it:

    Suppose your hashCode function results in the following hashCodes among others {x, 2x, 3x, 4x, 5x, 6x, ...}, then all these are going to be clustered in just m buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial to verify/derive this.) Now you can do one of the following to avoid clustering:

    ...

    Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e. by making table_length coprime with x. And if x can be just about any number, then make sure that table_length is a prime number.
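
    A small sketch of that claim, using a hypothetical common factor x = 4 and hash codes x, 2x, 3x, ...: the number of distinct buckets actually hit matches table_length / gcd(table_length, x), and only a prime table length uses every bucket.

    ```java
    import java.math.BigInteger;
    import java.util.TreeSet;

    public class PrimeTableDemo {
        // Count distinct buckets hit by hash codes x, 2x, 3x, ..., 1000x
        static int bucketsUsed(int tableLength, int x) {
            TreeSet<Integer> buckets = new TreeSet<>();
            for (int k = 1; k <= 1000; k++) {
                buckets.add((k * x) % tableLength);
            }
            return buckets.size();
        }

        public static void main(String[] args) {
            int x = 4; // hypothetical common factor shared by the hash codes

            for (int tableLength : new int[] { 16, 100, 101 }) { // 101 is prime
                int gcd = BigInteger.valueOf(tableLength)
                                    .gcd(BigInteger.valueOf(x)).intValue();
                System.out.printf(
                    "table_length=%d: gcd=%d, predicted buckets=%d, observed buckets=%d%n",
                    tableLength, gcd, tableLength / gcd, bucketsUsed(tableLength, x));
            }
            // table_length=16 : gcd=4, only  4 of  16 buckets are used
            // table_length=100: gcd=4, only 25 of 100 buckets are used
            // table_length=101: gcd=1, all 101 buckets are used
        }
    }
    ```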
