How do HashSets in Java work? [duplicate]

 ̄綄美尐妖づ 提交于 2019-11-28 05:59:18
amit

First, HashSet, unlike ArrayList is a Set: It cannot contain duplicates while ArrayList can - so they are built for different purposes. It also does not guarantee ordering - again, unlike a list.

Second - a HashSet is built on the hash table data structure, that allows O(1) seek time for an element.

Note that many times, a HashSet is slower then an ArrayList - if you want to iterate on elements for example - usually doing it in an ArrayList will be faster then in a HashSet [because of bad cache performance of hash, among other reasons]

Bozho

A HashSet is actually a HashMap where the value is always the same.

The way a HashMap works is described in many places (it is referred to as "hashtable" as well). In short: it generates hashes of keys (objects) and positions them into a table. Then each time you look for a key, its hash is computed and the bucket in the table is referenced directly. This means you have just one operation (best case) to access the map.

The HashSet simply contains the keys, so .contains(..) is O(1). That and remove(..) are the only operations a HashSet is faster than an ArrayList (which is O(n)). Iteration is the same, addition is the same.

These are 2 different data structures.

The concept behind HashSet is key probing.
I.e. you use a transformation of the input key to get an index of the location of the value in an array.
This is a constant O(1) operation since an array allows random access.

The arraylist is also O(1) operation for access since it is also backed by an array. But only for random access and insertion.

The search though is O(N) operation for an arraylist since you have to search through all the elemements in the list to get to the value unlike the HashSet where you just transform the key and access the array. Search in a HashSet is O(1)

As a matter of fact, for example iterating over and appending to an ArrayList is faster.

And heck, you cannot even sort a HashSet.

But the fastest of all is the NoOp. There is nothing just remotely as fast as the NoOp. Granted, it doesn't do much, the NoOp. But it's really fast at that!

You need to be more precise in what you consider to be "faster than".

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!