Why is setting a hash table's length to a prime number good practice?

被撕碎了的回忆 2020-12-13 19:47

I was going through Eric Lippert's latest blog post, Guidelines and rules for GetHashCode, when I hit this paragraph:

We could be even more clever here;

3 Answers
  •  一生所求
    2020-12-13 20:05

    Say your bucket set length is a power of 2 - that makes the mod calculations quite fast. It also means that the bucket selection is determined solely by the bottom n bits of the hash code (where the table length is 2^n), so the remaining 32 - n bits are effectively discarded. It's like you're throwing away useful bits of the hash code immediately.
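
    A minimal sketch of that point, assuming a hypothetical 16-bucket table (2^4): hash codes that differ only in their upper 28 bits all land in the same bucket, because the modulo only looks at the low 4 bits.

    ```java
    public class PowerOfTwoBuckets {
        public static void main(String[] args) {
            int tableLength = 16; // 2^4, so only the low 4 bits pick the bucket

            // Hash codes that differ only in their upper 28 bits
            int[] hashes = { 0x00000005, 0x12340005, 0xABCD0005 };

            for (int h : hashes) {
                // Mask to a non-negative value, then mod by the table length.
                // For a power-of-two length this is the same as h & 0b1111:
                // the top 28 bits never influence the bucket index.
                int bucket = (h & 0x7FFFFFFF) % tableLength;
                System.out.printf("hash=0x%08X -> bucket %d%n", h, bucket);
            }
            // All three hashes print "bucket 5": the upper bits were thrown away.
        }
    }
    ```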

    Or, as this blog post from 2006 puts it:

    Suppose your hashCode function results in the following hashCodes among others {x, 2x, 3x, 4x, 5x, 6x, ...}, then all these are going to be clustered in just m buckets, where m = table_length/GreatestCommonFactor(table_length, x). (It is trivial to verify/derive this.) Now you can do one of the following to avoid clustering:

    ...

    Or simply make m equal to the table_length by making GreatestCommonFactor(table_length, x) equal to 1, i.e. by making table_length coprime with x. And if x can be just about any number, then make sure that table_length is a prime number.
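
    A small sketch of that claim, using a hypothetical common factor x = 4 and hash codes x, 2x, 3x, ...: the number of distinct buckets actually hit matches table_length / gcd(table_length, x), and only a prime table length uses every bucket.

    ```java
    import java.math.BigInteger;
    import java.util.TreeSet;

    public class PrimeTableDemo {
        // Count distinct buckets hit by hash codes x, 2x, 3x, ..., 1000x
        static int bucketsUsed(int tableLength, int x) {
            TreeSet<Integer> buckets = new TreeSet<>();
            for (int k = 1; k <= 1000; k++) {
                buckets.add((k * x) % tableLength);
            }
            return buckets.size();
        }

        public static void main(String[] args) {
            int x = 4; // hypothetical common factor shared by the hash codes

            for (int tableLength : new int[] { 16, 100, 101 }) { // 101 is prime
                int gcd = BigInteger.valueOf(tableLength)
                                    .gcd(BigInteger.valueOf(x)).intValue();
                System.out.printf(
                    "table_length=%d: gcd=%d, predicted buckets=%d, observed buckets=%d%n",
                    tableLength, gcd, tableLength / gcd, bucketsUsed(tableLength, x));
            }
            // table_length=16 : gcd=4, only  4 of  16 buckets are used
            // table_length=100: gcd=4, only 25 of 100 buckets are used
            // table_length=101: gcd=1, all 101 buckets are used
        }
    }
    ```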
