HashSet<T> versus Dictionary<K, V> w.r.t searching time to find if an item exists

Submitted by 吃可爱长大的小学妹 on 2019-11-26 05:20:11

Question


HashSet<T> t = new HashSet<T>();
// add 10 million items


Dictionary<K, V> t = new Dictionary<K, V>();
// add 10 million items.

Whose .Contains method will return quicker?

Just to clarify: I have 10 million objects (strings, really), and I need to check whether a given one exists in the data structure. I will NEVER iterate over it.


Answer 1:


HashSet vs List vs Dictionary performance test, taken from here.

[Chart: Add 1000000 objects (without checking duplicates)]

[Chart: Contains check for half the objects of a collection of 10000]

[Chart: Remove half the objects of a collection of 10000]




Answer 2:


I assume you mean Dictionary<TKey, TValue> in the second case? Hashtable is a non-generic class.

You should choose the right collection for the job based on your actual requirements. Do you actually want to map each key to a value? If so, use Dictionary<,>. If you only care about it as a set, use HashSet<>.

I would expect HashSet<T>.Contains and Dictionary<TKey, TValue>.ContainsKey (which are the comparable operations, assuming you're using your dictionary sensibly) to basically perform the same - they're using the same algorithm, fundamentally. I guess with the entries in Dictionary<,> being larger you end up with a greater likelihood of blowing the cache with Dictionary<,> than with HashSet<>, but I'd expect that to be insignificant compared with the pain of choosing the wrong data type simply in terms of what you're trying to achieve.
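One way to check this expectation on your own data is a rough timing sketch like the one below. The names LookupTiming, MakeKeys, Probe and Run are hypothetical helpers, not part of .NET, and the "item0", "item1", ... keys are made-up test data. Both lookups go through the same delegate, so that overhead is equal on both sides. Treat the numbers as indicative only; a proper benchmark harness would give more trustworthy results.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupTiming
{
    // Builds n distinct strings "item0" .. "item{n-1}" (made-up test data).
    public static string[] MakeKeys(int n)
    {
        var keys = new string[n];
        for (int i = 0; i < n; i++) keys[i] = "item" + i;
        return keys;
    }

    // Probes every key through the supplied lookup and returns
    // how many were found together with the elapsed time.
    public static (int Hits, TimeSpan Elapsed) Probe(string[] keys, Func<string, bool> lookup)
    {
        var sw = Stopwatch.StartNew();
        int hits = 0;
        foreach (var k in keys)
        {
            if (lookup(k)) hits++;
        }
        sw.Stop();
        return (hits, sw.Elapsed);
    }

    // Fills a HashSet and a Dictionary with the same keys and times both lookups.
    public static void Run(int n)
    {
        string[] keys = MakeKeys(n);

        var set = new HashSet<string>(keys);
        var dict = new Dictionary<string, bool>(n);
        foreach (var k in keys) dict[k] = true; // the value is a dummy

        var s = Probe(keys, set.Contains);
        var d = Probe(keys, dict.ContainsKey);

        Console.WriteLine($"HashSet.Contains:       {s.Hits} hits, {s.Elapsed.TotalMilliseconds} ms");
        Console.WriteLine($"Dictionary.ContainsKey: {d.Hits} hits, {d.Elapsed.TotalMilliseconds} ms");
    }
}
```

Calling LookupTiming.Run(10_000_000) approximates the scenario in the question, memory permitting. Run it a few times in a release build before drawing conclusions.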




Answer 3:


From the MSDN documentation for Dictionary<TKey,TValue>:

"Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table."

With a note:

"The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey"
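To see why hash quality matters, the sketch below counts how many equality comparisons a HashSet<string> performs during lookups under two different hash functions. CountingComparer and HashQualityDemo are hypothetical names introduced only for this illustration, and the constant hash is deliberately bad. The same reasoning should carry over to Dictionary<TKey,TValue>, since both accept an IEqualityComparer<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that delegates hashing to a supplied function
// and counts how often Equals is invoked.
public sealed class CountingComparer : IEqualityComparer<string>
{
    private readonly Func<string, int> _hash;

    public long EqualsCalls { get; private set; }

    public CountingComparer(Func<string, int> hash)
    {
        _hash = hash;
    }

    public bool Equals(string x, string y)
    {
        EqualsCalls++;
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string s) => _hash(s);
}

public static class HashQualityDemo
{
    // Fills a set with n keys, then looks every key up again and returns
    // the number of Equals calls made during the lookups only.
    public static long CountComparisons(int n, Func<string, int> hash)
    {
        var cmp = new CountingComparer(hash);
        var set = new HashSet<string>(cmp);
        for (int i = 0; i < n; i++) set.Add("item" + i);

        long before = cmp.EqualsCalls;
        for (int i = 0; i < n; i++) set.Contains("item" + i);
        return cmp.EqualsCalls - before;
    }
}
```

Comparing HashQualityDemo.CountComparisons(1000, s => s.GetHashCode()) with HashQualityDemo.CountComparisons(1000, s => 0) should show roughly one comparison per lookup for the built-in string hash, and far more when every key collides, because each lookup then has to walk through the colliding entries.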

I know your question/post is old - but while looking for an answer to a similar question I stumbled across this.

Hope this helps. Scroll down to the Remarks section for more details. https://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.110).aspx




Answer 4:


These are different data structures. Also, there is no generic version of Hashtable.

HashSet contains values of type T, whereas Hashtable (or Dictionary) contains key-value pairs. So you should choose the collection based on what data you need to store.
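A minimal sketch of that distinction, using made-up helper names (SetVersusMap, BuildSeen, CountWords): a set answers only "is this value present?", while a dictionary also carries a value for each key.

```csharp
using System;
using System.Collections.Generic;

public static class SetVersusMap
{
    // Membership only: each distinct word is stored once.
    public static HashSet<string> BuildSeen(IEnumerable<string> words)
    {
        return new HashSet<string>(words);
    }

    // Key-to-value mapping: each word is associated with its occurrence count.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var w in words)
        {
            // TryGetValue leaves c at 0 when the word has not been seen yet.
            counts.TryGetValue(w, out int c);
            counts[w] = c + 1;
        }
        return counts;
    }
}
```

For the question as asked (existence checks on strings, no associated data), only the first shape is needed.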



Source: https://stackoverflow.com/questions/2728500/hashsett-versus-dictionaryk-v-w-r-t-searching-time-to-find-if-an-item-exist
