Hashing Keys in Java

Submitted by 折月煮酒 on 2019-12-21 15:11:13

Question


In Java, when I use a String as a key for a HashMap, I get a slightly different result than when I use the string hashcode as a key in the HashMap.

Any insight?


Answer 1:


"when I use the string hashcode as a key in the HashMap."

You mustn't use the hash code itself as the key. Hash codes aren't intended to be unique - it's entirely permitted for two non-equal values to have the same hash code. You should use the string itself as a key. The map will then compare hash codes first (to narrow down the candidate matches quickly) and then compare with equals for genuine string equality.

Of course, that's assuming your code really is what your question makes it sound like, e.g.

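import java.util.HashMap;

// Keyed by the string itself: non-equal strings are always distinct keys.
HashMap<String, String> goodMap = new HashMap<>();
goodMap.put("foo", "bar");

// Keyed by the hash code: two strings with the same hash code share one key.
HashMap<Integer, String> badMap = new HashMap<>();
badMap.put("foo".hashCode(), "bar");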

If that's really what your code looks like, just use HashMap<String, String> instead.
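
To make the difference concrete, here is a minimal sketch using "Aa" and "BB", two non-equal strings that happen to share a hash code under String.hashCode(). The class and method names are made up for illustration and are not from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the name is not from the original question.
class CollisionDemo {

    // Keys are the strings themselves, so both entries survive.
    static Map<String, String> byString() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    // Keys are hash codes; "Aa" and "BB" collide, so the second put overwrites the first.
    static Map<Integer, String> byHashCode() {
        Map<Integer, String> map = new HashMap<>();
        map.put("Aa".hashCode(), "first");
        map.put("BB".hashCode(), "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(byString().size());                  // 2
        System.out.println(byHashCode().size());                // 1
        System.out.println(byHashCode().get("Aa".hashCode()));  // second
    }
}
```

The String-keyed map keeps both entries, while the Integer-keyed map silently loses the value stored for "Aa".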

From the docs for Object.hashCode() (emphasis mine):

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
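
As an illustration of the contract quoted above, here is a small sketch of a value class whose hashCode is derived from the same fields as equals. Point and ContractDemo are hypothetical names chosen for this example; Objects.hash is just one convenient way to combine fields.

```java
import java.util.Objects;

// Hypothetical value class used only to illustrate the contract.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Point)) return false;
        Point p = (Point) other;
        return x == p.x && y == p.y;
    }

    // Built from the same fields as equals, so equal points get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ContractDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true, required by the contract
        System.out.println(a.hashCode() == a.hashCode()); // true, consistent within one run
    }
}
```

Nothing in the contract runs the other way: two unequal Point values are allowed to return the same hash code, which is exactly why a hash code cannot stand in for the object itself.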



Answer 2:


Of course. Different Strings can have the same hashCode, so if you store two such strings as keys in a map, you'll have two entries (since the strings are different). Whereas if you use their hashCode as the key, you'll have only one entry (since their hashCode is the same).

The hashCode isn't used to tell if two keys are equal. It's only used to assign a bucket to the key. Once the bucket is found, every key contained in the bucket is compared to the new key with equals, and the key is added to the bucket if no equal key can be found.
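
The bucket-then-equals procedure described above can be sketched as a tiny hash set. This is a deliberately simplified model with made-up names (TinyHashSet, bucketFor); java.util.HashMap is considerably more elaborate, so treat this as an illustration of the idea only. The strings "Ea" and "FB" are used because they share a hash code under String.hashCode().

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of a hash table; not how java.util.HashMap is actually written.
class TinyHashSet {
    private final List<List<String>> buckets = new ArrayList<>();

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // The hash code only selects the bucket.
    private List<String> bucketFor(String key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    // equals decides whether an equal key is already stored in that bucket.
    boolean add(String key) {
        List<String> bucket = bucketFor(key);
        for (String existing : bucket) {
            if (existing.equals(key)) {
                return false; // an equal key is already present
            }
        }
        bucket.add(key);
        return true;
    }

    int size() {
        int total = 0;
        for (List<String> bucket : buckets) {
            total += bucket.size();
        }
        return total;
    }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(16);
        System.out.println(set.add("Ea")); // true
        System.out.println(set.add("FB")); // true: same bucket, but not equal to "Ea"
        System.out.println(set.add("Ea")); // false: an equal key is already stored
        System.out.println(set.size());    // 2
    }
}
```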




Answer 3:


The problem is that two objects being different doesn't mean that their hash codes are also different.

Two different objects can share the same hash code, so you shouldn't use hash codes as HashMap keys.

Also, because the hash codes returned by Object.hashCode() are of type int, there are only 2^32 possible values. Whenever more distinct objects than that can exist (as is the case for strings), some different objects must share a hash code; which ones "collide" depends on the hashing algorithm.

In short:

!obj.equals(obj1) doesn't ensure that obj.hashCode() != obj1.hashCode().




Answer 4:


Hash codes can be the same for different Strings, so be careful with that. Maybe this is why you are getting a different result.

Here's another SO question on it. See Jon Skeet's accepted answer.




Answer 5:


You can use the hash code as the key only if the hash function is a perfect hash (see e.g. GPERF). As long as your key objects don't reside in memory you are correct that you will save memory.
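
For a fixed, known set of keys, one way to test whether hash codes happen to be collision-free is sketched below. PerfectHashCheck and isCollisionFree are hypothetical names, not part of any library, and the check says nothing about keys outside the given list; String.hashCode() is not a perfect hash in general.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of any library.
class PerfectHashCheck {

    // Returns true if no two distinct keys in the list share a hash code.
    static boolean isCollisionFree(List<String> keys) {
        Set<String> distinctKeys = new HashSet<>(keys);
        Set<Integer> hashCodes = new HashSet<>();
        for (String key : distinctKeys) {
            if (!hashCodes.add(key.hashCode())) {
                return false; // two distinct keys share this hash code
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isCollisionFree(Arrays.asList("foo", "bar", "baz"))); // true
        System.out.println(isCollisionFree(Arrays.asList("Aa", "BB")));          // false
    }
}
```

Only when such a check passes for every key the map will ever see is it safe to key the map by hash code alone.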



Source: https://stackoverflow.com/questions/13208108/hashing-keys-in-java
