Why ArrayList grows at a rate of 1.5, but for Hashmap it's 2?

半世苍凉 提交于 2019-12-12 07:10:55

问题


As per Sun Java Implementation, during expansion, ArrayList grows to 3/2 it's initial capacity whereas for HashMap the expansion rate is double. What is reason behind this?

As per the implementation, for HashMap, the capacity should always be in the power of two. That may be a reason for HashMap's behavior. But in that case the question is, for HashMap why the capacity should always be in power of two?


回答1:


The expensive part at increasing the capacity of an ArrayList is copying the content of the backing array a new (larger) one.

For the HashMap, it is creating a new backing array and putting all map entries in the new array. And, the higher the capacity, the lower the risk of collisions. This is more expensive and explains, why the expansion factor is higher. The reason for 1.5 vs. 2.0? I consider this as "best practise" or "good tradeoff".




回答2:


for HashMap why the capacity should always be in power of two?

I can think of two reasons.

  1. You can quickly determine the bucket a hashcode goes in to. You only need a bitwise AND and no expensive modulo. int bucket = hashcode & (size-1);

  2. Let's say we have a grow factor of 1.7. If we start with a size 11, the next size would be 18, then 31. No problem. Right? But the hashcodes of Strings in Java, are calculated with a prime factor of 31. The bucket a string goes into,hashcode%31, is then determined only by the last character of the String. Bye bye O(1) if you store folders that all end in /. If you use a size of, for example, 3^n, the distribution will not get worse if you increase n. Going from size 3 to 9, every element in bucket 2, will now go to bucket 2,5 or 7, depending on the higher digit. It's like splitting each bucket in three pieces. So a size of integer growth factor would be preferred. (Off course this all depends on how you calculate hashcodes, but a arbitrary growth factor doesn't feel 'stable'.)




回答3:


The way HashMap is designed/implemented its underlying number of buckets must be a power of 2 (even if you give it a different size, it makes it a power of 2), thus it grows by a factor of two each time. An ArrayList can be any size and it can be more conservative in how it grows.




回答4:


Hashing takes advantage of distributing data evenly into buckets. The algorithm tries to prevent multiple entries in the buckets ("hash collisions"), as they will decrease performance.

Now when the capacity of a HashMap is reached, size is extended and existing data is re-distributed with the new buckets. If the size-increas would be too small, this re-allocation of space and re-dsitribution would happen too often.




回答5:


I can't give you a reason why this is so (you'd have to ask Sun developers), but to see how this happens take a look at source:

  1. HashMap: Take a look at how HashMap resizes to new size (source line 799)

         resize(2 * table.length);
    
  2. ArrayList: source, line 183:

    int newCapacity = (oldCapacity * 3)/2 + 1;
    

Update: I mistakenly linked to sources of Apache Harmony JDK - changed it to Sun's JDK.




回答6:


A general rule to avoid collisions on Maps is to keep to load factor max at around 0.75 To decrease possibility of collisions and avoid expensive copying process HashMap grows at a larger rate.

Also as @Peter says, it must be a power of 2.



来源:https://stackoverflow.com/questions/5040753/why-arraylist-grows-at-a-rate-of-1-5-but-for-hashmap-its-2

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!