Why chose 31 to do the multiplication in the hashcode() implementation ? [duplicate]

Possible Duplicate:
Why does Java's hashCode() in String use 31 as a multiplier?
Why use a prime number in hashCode?

From Effective Java Item 9: Always override hashCode when you override equals consider the following relevant code snippet that overrides the hashcode() defined in Object class.

public final class PhoneNumber {
  private final short areaCode;
  private final short prefix;
  private final short lineNumber;
  ......//Rest ignored on purpose
  .....

private volatile int hashCode; // (See Item 71)
@Override public int hashCode() {
   int result = hashCode;
   if (result == 0) {
    result = 17;
    result = 31 * result + areaCode;
    result = 31 * result + prefix;
    result = 31 * result + lineNumber;
    hashCode = result;
   }
 return result;
 }
}

I understand why a non zero initial value "17" is chosen . I also understand the that multiplication is done in each step so that the order of fields play an important role in computing hashcode(). But I do not understand the reason for choosing 31 as the value for multiplication . Can some one explain to me why ? This is what Bloch has to say about 31 but I do not really get it . I specifically fail to understand the line in italics below.

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional.

Shifting left just introduces a zero on the right and loses a bit on the left of the number's binary representation, so it's a clear information loss. Repeating this process gradually loses all information that was accumulated from earlier computation. That means that the more fields enter your hashcode calculation, the less effect on the final result the early fields have.

he reason for using a prime is that it is more likely to produce a random pattern. If you use 9 for example you can get over lap with multiples of 3.

AFAIK 31 is used for Strings as there is less than 31 letters in the alphabet meaning all words of up to 6 letters have a unique hash code. If you use 61 (prime less than 64) for example, up to 5 letters would produce unique codes and if you use 13 (prime less than 16) you can get collisions with two letters words.

I'm going to describe the answer for a different number, but I suspect that the reasoning is similar. The contribution to the hash value of a character X is X*B^k, where B in your case is 31, and k depends on the position of X in the string and its length. This arithmetic is usually done modulo the word size. For this reason we want B^k to be different for different values of k.

Now, in "Handbook of Algorithms and Data Structures" by Gonnet and Baeza-Yates section 3.3.1 "Pratical hashing functions" they say "For this function the value B=131 is recommended, as B^i has a maximum cycle mod 2^k for 8 <= k <= 64." I wonder what cycle length 31 has mod 2^32? I believe that 31 will fit into a Sparc immediate operand, but 131 will not.

来源：https://stackoverflow.com/questions/11537005/why-chose-31-to-do-the-multiplication-in-the-hashcode-implementation

标签

java

algorithm

hashcode