Why choose 31 for the multiplication in the hashCode() implementation? [duplicate]

青春壹個敷衍的年華 submitted on 2019-12-05 16:03:08

Shifting left just introduces a zero on the right and loses a bit on the left of the number's binary representation, so it's a clear information loss. Repeating this process gradually loses all information that was accumulated from earlier computation. That means that the more fields enter your hashcode calculation, the less effect on the final result the early fields have.
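To make the information loss concrete, here is a minimal sketch that assumes the usual "h = h * multiplier + field" combining scheme. The class and method names are made up for illustration. With a multiplier of 32 (a left shift by 5), the first of eight int fields is shifted completely out of a 32-bit int, while with 31 it still affects the result.

```java
class ShiftVsMultiply {
    // Hypothetical helper: folds the fields into one int, wrapping on overflow.
    static int combine(int[] fields, int multiplier) {
        int h = 0;
        for (int f : fields) {
            h = h * multiplier + f;
        }
        return h;
    }

    public static void main(String[] args) {
        // The two arrays differ only in their first (earliest) field.
        int[] a = {1, 0, 0, 0, 0, 0, 0, 0};
        int[] b = {2, 0, 0, 0, 0, 0, 0, 0};

        // Multiplying by 32 seven times shifts the first field left by 35 bits,
        // so nothing of it remains in a 32-bit int.
        System.out.println(combine(a, 32) == combine(b, 32)); // true

        // 31 is odd, so multiplying by it never discards the field outright.
        System.out.println(combine(a, 31) == combine(b, 31)); // false
    }
}
```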

The reason for using a prime is that it is more likely to produce a random pattern. If you use 9, for example, you can get overlap with multiples of 3.

AFAIK 31 is used for Strings because there are fewer than 31 letters in the alphabet, meaning all words of up to 6 letters have a unique hash code. If you use 61 (a prime less than 64), for example, words of up to 5 letters would produce unique codes, and if you use 13 (a prime less than 16) you can get collisions with two-letter words.
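The two-letter case is small enough to check by brute force. The sketch below assumes lowercase ASCII words only and uses a hand-written hash with a configurable multiplier (the names are illustrative, not taken from the JDK). It counts the distinct hash values over all 676 two-letter words.

```java
import java.util.HashSet;
import java.util.Set;

class TwoLetterCollisions {
    // Same shape as String.hashCode(), but with a configurable multiplier b.
    static int hash(String s, int b) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * b + s.charAt(i);
        }
        return h;
    }

    // Number of distinct hash values over all two-letter lowercase words.
    static int distinctTwoLetterHashes(int b) {
        Set<Integer> seen = new HashSet<>();
        for (char c1 = 'a'; c1 <= 'z'; c1++) {
            for (char c2 = 'a'; c2 <= 'z'; c2++) {
                seen.add(hash("" + c1 + c2, b));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctTwoLetterHashes(31)); // 676, every word is unique
        System.out.println(distinctTwoLetterHashes(13)); // fewer than 676
        // One concrete collision with 13: 97*13 + 110 == 98*13 + 97 == 1371
        System.out.println(hash("an", 13) == hash("ba", 13)); // true
    }
}
```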

I'm going to describe the answer for a different number, but I suspect that the reasoning is similar. The contribution to the hash value of a character X is X*B^k, where B in your case is 31, and k depends on the position of X in the string and on the string's length. This arithmetic is usually done modulo 2^w, where w is the word size (2^32 for a Java int). For this reason we want B^k to be different for different values of k.
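As a sketch of that formula, the code below sums X*B^k explicitly, taking k = n-1-i for the character at index i in a string of length n, and relies on int overflow for the modular reduction. The class and method names are invented for this example. For B = 31 it should agree with the documented definition of String.hashCode().

```java
class PolynomialHash {
    // Sum of s[i] * b^(n-1-i), computed in wrapping 32-bit arithmetic.
    static int polyHash(String s, int b) {
        int n = s.length();
        int h = 0;
        for (int i = 0; i < n; i++) {
            int k = n - 1 - i;          // exponent for the character at index i
            int power = 1;
            for (int j = 0; j < k; j++) {
                power *= b;             // b^k mod 2^32
            }
            h += s.charAt(i) * power;   // contribution X * B^k
        }
        return h;
    }

    public static void main(String[] args) {
        String word = "hello";
        System.out.println(polyHash(word, 31) == word.hashCode()); // true
    }
}
```

Production code normally uses the equivalent Horner form, h = h * B + X, which avoids recomputing the powers.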

Now, in "Handbook of Algorithms and Data Structures" by Gonnet and Baeza-Yates, section 3.3.1 "Practical hashing functions", they say "For this function the value B=131 is recommended, as B^i has a maximum cycle mod 2^k for 8 <= k <= 64." I wonder what cycle length 31 has mod 2^32? I believe that 31 will fit into a SPARC immediate operand, but 131 will not.
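The cycle length asked about here is the multiplicative order of B modulo 2^32, and it can be computed directly. This sketch relies on the fact that the order of any odd number modulo 2^32 is a power of two, so repeated squaring finds it in at most about 30 steps. The class and method names are made up for illustration.

```java
class CycleLength {
    // Smallest n > 0 with b^n == 1 (mod 2^32), for odd b.
    // Java int multiplication wraps, which gives the reduction mod 2^32.
    static long orderMod2Pow32(int b) {
        if ((b & 1) == 0) {
            throw new IllegalArgumentException("b must be odd");
        }
        long order = 1;
        int x = b;          // invariant: x == b^order (mod 2^32)
        while (x != 1) {
            x *= x;         // squaring doubles the exponent
            order *= 2;
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(orderMod2Pow32(31));  // 134217728  (2^27)
        System.out.println(orderMod2Pow32(131)); // 1073741824 (2^30)
    }
}
```

Since 2^30 is the largest order any multiplier can have modulo 2^32, 131 does reach the maximum cycle mentioned in the quote, while the cycle for 31 is eight times shorter.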
