How does a Java HashMap handle different objects with the same hash code?

余生分开走 2020-11-22 02:27

As per my understanding I think:

  1. It is perfectly legal for two objects to have the same hashcode.
  2. If two objects are equal (using the equals() method), then they must have the same hashcode.
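
For example, a minimal sketch (using a hypothetical Key class) of two distinct, non-equal objects that share a hash code and still coexist in a HashMap:

    import java.util.HashMap;
    import java.util.Map;

    class Key {
        final String name;
        Key(String name) { this.name = name; }

        @Override
        public int hashCode() { return 42; }   // every Key collides on purpose

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
    }

    public class CollisionDemo {
        public static void main(String[] args) {
            Map<Key, String> map = new HashMap<>();
            map.put(new Key("a"), "first");
            map.put(new Key("b"), "second");            // same hash code, different key
            System.out.println(map.size());             // 2: both entries are kept
            System.out.println(map.get(new Key("a")));  // "first"
        }
    }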
14 Answers
  •  迷失自我
    2020-11-22 03:17

    Here is a rough description of HashMap's mechanism for Java 8 (it may differ slightly from Java 6).


    Data structures

    • Hash table
      The hash value is calculated via hash() on the key, and it decides which bucket of the hash table to use for a given key.
    • Linked list (singly)
      When the number of elements in a bucket is small, a singly linked list is used.
    • Red-black tree
      When the number of elements in a bucket is large, a red-black tree is used (see the constants quoted after this list).
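
    The switch between list and tree is governed by a few constants. As best I recall from the OpenJDK 8 source (treat the exact values as approximate), they are:

      static final int TREEIFY_THRESHOLD = 8;     // convert a bucket's list to a tree at 8 entries
      static final int UNTREEIFY_THRESHOLD = 6;   // convert back to a list when it shrinks to 6
      static final int MIN_TREEIFY_CAPACITY = 64; // below 64 buckets, resize instead of treeifying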

    Classes (internal)

    • Map.Entry
      Represents a single entry in the map: a key/value pair.
    • HashMap.Node
      The linked-list version of a node.

      It can represent:

      • A hash bucket,
        because it carries a hash field.
      • A node in a singly linked list (and thus also the head of that list).
    • HashMap.TreeNode
      The tree version of a node. (A simplified sketch of Node follows this list.)
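
      A minimal sketch of what such a node looks like (simplified from OpenJDK 8; the real Node class also implements Map.Entry and has accessor methods):

      static class Node<K, V> {
          final int hash;       // cached hash of the key, computed once
          final K key;
          V value;
          Node<K, V> next;      // next node in the same bucket's list

          Node(int hash, K key, V value, Node<K, V> next) {
              this.hash = hash;
              this.key = key;
              this.value = value;
              this.next = next;
          }
      }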

    Fields (internal)

    • Node[] table
      The bucket table (each slot is the head of a linked list or the root of a tree).
      If a bucket contains no elements, its slot is null, so it only takes the space of a reference.
    • Set<Map.Entry<K, V>> entrySet
      The set of entries.
    • int size
      The number of entries.
    • float loadFactor
      Indicates how full the hash table is allowed to get before resizing.
    • int threshold
      The size at which the next resize happens.
      Formula: threshold = capacity * loadFactor
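      For example, with the default capacity of 16 and the default load factor of 0.75, threshold = 16 * 0.75 = 12, so the 13th insertion triggers a resize.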

    Methods (internal)

    • int hash(key)
      Calculates the hash from the key.
    • How to map a hash to a bucket?
      Use the following logic:

      static int hashToBucket(int tableSize, int hash) {
          // tableSize is always a power of two, so this mask is
          // equivalent to hash % tableSize for non-negative hashes
          return (tableSize - 1) & hash;
      }
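
      In Java 8 the hash() step also spreads the high bits of the key's hashCode() into the low bits (roughly h ^ (h >>> 16)), so keys that differ only in their high bits still land in different buckets. A minimal sketch of that step plus the bucket mapping (simplified; not the exact OpenJDK code):

      static int hash(Object key) {
          int h;
          // null keys always map to bucket 0; otherwise XOR the high
          // half of hashCode() into the low half before masking
          return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
      }

      // Example: with 16 buckets, only the low 4 bits of the spread hash select the bucket.
      // int bucket = hashToBucket(16, hash("some key"));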
      

    About capacity

    In a hash table, capacity means the bucket count; it can be read from table.length.
    It can also be calculated from threshold and loadFactor, so it does not need to be stored as a separate field.

    The effective capacity can be obtained via capacity().
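
    A rough sketch of how the effective capacity can be derived (modeled on the package-private capacity() helper in OpenJDK 8; the field names are the real ones, but the exact code may differ):

      static int capacityOf(Object[] table, int threshold) {
          if (table != null)
              return table.length;     // table allocated: capacity == bucket count
          return (threshold > 0)
                  ? threshold          // table not yet allocated, but an initial capacity was requested
                  : 16;                // DEFAULT_INITIAL_CAPACITY
      }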


    Operations

    • Find an entry by key.
      First find the bucket by hash value, then walk the linked list or search the sorted tree.
      (A simplified lookup sketch follows this list.)
    • Add an entry with a key.
      First find the bucket according to the hash value of the key.
      Then try to find an existing entry for the key:
      • If found, replace the value.
      • Otherwise, append a new node to the end of the linked list, or insert it into the sorted tree.
    • Resize
      When the threshold is reached, the table's capacity (table.length) is doubled, and all elements are re-hashed to rebuild the table.
      This can be an expensive operation.
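
    To make the lookup path concrete, here is a simplified sketch of the "find an entry by key" logic for the linked-list case, reusing the Node sketch above (hypothetical helper name; the real HashMap.getNode() also handles the tree case and other details):

      static <K, V> V findInBucket(Node<K, V>[] table, Object key, int hash) {
          Node<K, V> node = table[(table.length - 1) & hash];  // locate the bucket
          while (node != null) {
              // compare the cached hash first, then fall back to equals()
              if (node.hash == hash && (key == node.key || (key != null && key.equals(node.key))))
                  return node.value;
              node = node.next;
          }
          return null;  // no entry with this key
      }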

    Performance

    • get & put
      Time complexity is O(1) on average, because:
      • The bucket is accessed via an array index, thus O(1).
      • The linked list in each bucket stays short, so it can be treated as O(1).
      • The tree size is also bounded, because the capacity is extended and elements are re-hashed as the element count grows, so it can be treated as O(1) rather than O(log N).
