Bad idea to use String key in HashMap?

爱一瞬间的悲伤 2020-12-12 10:43

I understand that the String class's hashCode() method is not guaranteed to generate unique hash codes for distinct Strings. I see a lot of usage of putting Strin

5 Answers
  •  野趣味 (original poster)
    2020-12-12 11:01

    I direct you to the answer here. Using strings as keys is not a bad idea (@CPerkins explained why, perfectly), but storing the values in a HashMap with integer keys is slightly better: it is generally quicker (although not noticeably so), and distinct Integer keys never share a hash code, because Integer.hashCode() returns the int value itself. Two keys can still land in the same bucket, which HashMap handles transparently.

    See this chart of timings and collisions using 216,553 keys in each case (taken from this post and reformatted for our discussion):

    Hash              Lowercase      Random UUID    Numbers
    ================  =============  =============  =============
    Murmur            145 ns         259 ns         92 ns
                      6 collis       5 collis       0 collis
    FNV-1a            152 ns         504 ns         86 ns
                      4 collis       4 collis       0 collis
    FNV-1             184 ns         730 ns         92 ns
                      1 collis       5 collis       0 collis*
    DJB2a             158 ns         443 ns         91 ns
                      5 collis       6 collis       0 collis***
    DJB2              156 ns         437 ns         93 ns
                      7 collis       6 collis       0 collis***
    SDBM              148 ns         484 ns         90 ns
                      4 collis       6 collis       0 collis**
    CRC32             250 ns         946 ns         130 ns
                      2 collis       0 collis       0 collis

    Avg time per key  0.8 ps         2.5 ps         0.44 ps
    Collisions (%)    0.002%         0.002%         0%
    

    Of course, there are only 2^32 distinct int values, whereas there is no limit to the number of distinct strings (and no theoretical limit to the number of keys that can be stored in a HashMap). If you use a long (or a double) instead, hash code collisions become possible again, because hashCode() has to fold 64 bits into a 32-bit int, so such keys are no "better" than strings in this respect. However, even with hash collisions, put() and get() will always put/get the correct key-value pair (see the edit below).
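
    To make the collision behaviour concrete, here is a minimal sketch. The class and method names (CollisionDemo, collidingStrings) are made up for illustration; "Aa" and "BB" are a well-known pair of strings that share the hash code 2112.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original answer.
class CollisionDemo {

    // Builds a map whose two String keys have the same hash code.
    static Map<String, Integer> collidingStrings() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        // Both strings hash to 2112, so they collide.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true

        // HashMap falls back to equals() inside the bucket, so both entries survive.
        Map<String, Integer> map = collidingStrings();
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2

        // An Integer's hash code is its own value, so distinct ints never collide.
        System.out.println(Integer.valueOf(42).hashCode()); // 42

        // A long is folded from 64 bits to 32, so distinct longs can collide.
        System.out.println(Long.hashCode(1L) == Long.hashCode(1L << 32)); // true
    }
}
```

    Both string entries are stored and retrieved correctly despite the shared hash code; the collision only costs an extra equals() comparison within the bucket.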

    In the end, it really doesn't matter, so use whatever is more convenient. But if convenience makes no difference, and you do not intend to have more than 2^32 entries, I suggest you use ints as keys.


    EDIT

    While the above is definitely true, NEVER use "StringKey".hashCode() as a key in place of the original String key for performance reasons. Two different strings can have the same hashCode, so the second put() would silently overwrite the first entry. Java's implementation of HashMap is smart enough to handle strings (any type of key, actually) with the same hash code automatically, so it is wise to let Java handle these things for you.
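
    A short sketch of that pitfall, again using the colliding pair "Aa" and "BB". The class and method names (HashAsKeyPitfall, keyedByHashCode, keyedByString) are hypothetical and only serve the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the original answer.
class HashAsKeyPitfall {

    // Wrong: the hash code replaces the String as the key.
    static Map<Integer, String> keyedByHashCode() {
        Map<Integer, String> map = new HashMap<>();
        map.put("Aa".hashCode(), "first");
        map.put("BB".hashCode(), "second"); // same Integer key, overwrites "first"
        return map;
    }

    // Right: the String itself is the key and HashMap resolves the collision.
    static Map<String, String> keyedByString() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(keyedByHashCode().size()); // 1
        System.out.println(keyedByString().size());   // 2
    }
}
```

    With the hash code as the key, the map has no way to tell the two strings apart, because the information needed for the equals() check has been thrown away.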
