Performance Benchmarking of Contains, Exists and Any

Anonymous (unverified), submitted on 2019-12-03 01:10:02

Question:

I have been looking for a performance benchmark comparing the Contains, Exists and Any methods available on List<T>. I wanted to find this out purely out of curiosity, as I was always confused about which of them to use. Many questions on SO only describe the definitions of these methods, such as:

  1. LINQ Ring: Any() vs Contains() for Huge Collections
  2. Linq .Any VS .Exists - Whats the difference?
  3. LINQ extension methods - Any() vs. Where() vs. Exists()

So I decided to do it myself, and I am adding it as an answer. Any further insight into the results is most welcome. I also ran this benchmark for arrays to compare the results.

Answer 1:

According to the documentation:

List<T>.Exists (instance method)

Determines whether the List<T> contains elements that match the conditions defined by the specified predicate.

Enumerable.Any (extension method on IEnumerable<T>)

Determines whether any element of a sequence satisfies a condition.

List<T>.Contains (instance method)

Determines whether an element is in the List<T>.
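To make the three definitions concrete, here is a minimal sketch that asks the same question ("is this value in the list?") in all three ways. The class and method names (MembershipDemo, Check) are made up for illustration and are not part of the benchmark.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MembershipDemo
{
    // Returns the answers in the order: Contains, Exists, Any.
    public static bool[] Check(List<int> list, int value)
    {
        bool viaContains = list.Contains(value);        // List<T> method, takes the value itself
        bool viaExists = list.Exists(a => a == value);  // List<T> method, takes a Predicate<T>
        bool viaAny = list.Any(a => a == value);        // LINQ extension, takes a Func<T, bool>
        return new[] { viaContains, viaExists, viaAny };
    }

    public static void Main()
    {
        var list = new List<int> { 3, 1, 4, 1, 5 };
        bool[] found = Check(list, 4);
        Console.WriteLine("{0} {1} {2}", found[0], found[1], found[2]);     // True True True
        bool[] missing = Check(list, 9);
        Console.WriteLine("{0} {1} {2}", missing[0], missing[1], missing[2]); // False False False
    }
}
```

Contains compares against a value, while Exists and Any run a delegate for every element they visit, which is one plausible reason for the gap in the timings.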

Benchmarking:

CODE:

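    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            ContainsExistsAnyShort();

            ContainsExistsAny();
        }

        private static void ContainsExistsAny()
        {
            Console.WriteLine("***************************************");
            Console.WriteLine("********* ContainsExistsAny ***********");
            Console.WriteLine("***************************************");

            // Large case: 6,000,000 random integers.
            // The random value range (6000000) is an assumption here and below.
            List<int> list = new List<int>(6000000);
            Random random = new Random();
            for (int i = 0; i < 6000000; i++)
            {
                list.Add(random.Next(6000000));
            }
            int[] arr = list.ToArray();

            find(list, arr);
        }

        private static void ContainsExistsAnyShort()
        {
            Console.WriteLine("***************************************");
            Console.WriteLine("***** ContainsExistsAnyShortRange *****");
            Console.WriteLine("***************************************");

            // Small case: 2,000 random integers.
            List<int> list = new List<int>(2000);
            Random random = new Random();
            for (int i = 0; i < 2000; i++)
            {
                list.Add(random.Next(6000000));
            }
            int[] arr = list.ToArray();

            find(list, arr);
        }

        private static void find(List<int> list, int[] arr)
        {
            // 10,000 random values to search for in every test.
            Random random = new Random();
            int[] find = new int[10000];
            for (int i = 0; i < 10000; i++)
            {
                find[i] = random.Next(6000000);
            }

            Stopwatch watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Contains(find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Exists(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Exists: {0:N0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Any(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Any: {0:N0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                arr.Contains(find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("Array/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

            Console.WriteLine("Arrays do not have Exists");

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                arr.Any(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("Array/Any: {0:N0}ms", watch.ElapsedMilliseconds);
        }
    }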

RESULTS

    ***************************************
    ***** ContainsExistsAnyShortRange *****
    ***************************************
    List/Contains: 96ms
    List/Exists: 146ms
    List/Any: 381ms
    Array/Contains: 34ms
    Arrays do not have Exists
    Array/Any: 410ms
    ***************************************
    ********* ContainsExistsAny ***********
    ***************************************
    List/Contains: 257,996ms
    List/Exists: 379,951ms
    List/Any: 884,853ms
    Array/Contains: 72,486ms
    Arrays do not have Exists
    Array/Any: 1,013,303ms


Answer 2:

The fastest way is to use a HashSet<T>. Contains on a HashSet<T> is O(1) on average.

I took your code and added a benchmark for HashSet<int>. The performance cost of HashSet<int> set = new HashSet<int>(list); is nearly zero.

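    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            ContainsExistsAnyShort();

            ContainsExistsAny();
        }

        private static void ContainsExistsAny()
        {
            Console.WriteLine("***************************************");
            Console.WriteLine("********* ContainsExistsAny ***********");
            Console.WriteLine("***************************************");

            // The random value range (6000000) is an assumption here and below.
            List<int> list = new List<int>(6000000);
            Random random = new Random();
            for (int i = 0; i < 6000000; i++)
            {
                list.Add(random.Next(6000000));
            }
            int[] arr = list.ToArray();
            HashSet<int> set = new HashSet<int>(list);

            find(list, arr, set);
        }

        private static void ContainsExistsAnyShort()
        {
            Console.WriteLine("***************************************");
            Console.WriteLine("***** ContainsExistsAnyShortRange *****");
            Console.WriteLine("***************************************");

            List<int> list = new List<int>(2000);
            Random random = new Random();
            for (int i = 0; i < 2000; i++)
            {
                list.Add(random.Next(6000000));
            }
            int[] arr = list.ToArray();
            HashSet<int> set = new HashSet<int>(list);

            find(list, arr, set);
        }

        private static void find(List<int> list, int[] arr, HashSet<int> set)
        {
            // 10,000 random values to search for in every test.
            Random random = new Random();
            int[] find = new int[10000];
            for (int i = 0; i < 10000; i++)
            {
                find[i] = random.Next(6000000);
            }

            Stopwatch watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Contains(find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Contains: {0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Exists(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Exists: {0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                list.Any(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("List/Any: {0}ms", watch.ElapsedMilliseconds);

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                arr.Contains(find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("Array/Contains: {0}ms", watch.ElapsedMilliseconds);

            Console.WriteLine("Arrays do not have Exists");

            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                arr.Any(a => a == find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("Array/Any: {0}ms", watch.ElapsedMilliseconds);

            // New test: hash-based lookup.
            watch = Stopwatch.StartNew();
            for (int rpt = 0; rpt < 10000; rpt++)
            {
                set.Contains(find[rpt]);
            }
            watch.Stop();
            Console.WriteLine("HashSet/Contains: {0}ms", watch.ElapsedMilliseconds);
        }
    }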

RESULTS

    ***************************************
    ***** ContainsExistsAnyShortRange *****
    ***************************************
    List/Contains: 65ms
    List/Exists: 106ms
    List/Any: 222ms
    Array/Contains: 20ms
    Arrays do not have Exists
    Array/Any: 281ms
    HashSet/Contains: 0ms
    ***************************************
    ********* ContainsExistsAny ***********
    ***************************************
    List/Contains: 120522ms
    List/Exists: 250445ms
    List/Any: 653530ms
    Array/Contains: 40801ms
    Arrays do not have Exists
    Array/Any: 522371ms
    HashSet/Contains: 3ms
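The HashSet timings above cover only the lookups, because the set is built before find() starts its stopwatch. A rough sketch for timing the construction on its own is below, assuming the same list size as the large run and the same assumed value range. The names HashSetCostDemo and BuildTimed are hypothetical, and no timings are claimed for it.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class HashSetCostDemo
{
    // Builds a HashSet<int> from the list and reports the construction time.
    public static HashSet<int> BuildTimed(List<int> list, out long buildMs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        var set = new HashSet<int>(list);   // single pass over the list
        watch.Stop();
        buildMs = watch.ElapsedMilliseconds;
        return set;
    }

    public static void Main()
    {
        const int size = 6000000;           // list size of the large benchmark
        var random = new Random();
        var list = new List<int>(size);
        for (int i = 0; i < size; i++)
        {
            list.Add(random.Next(size));    // value range is an assumption
        }

        long buildMs;
        HashSet<int> set = BuildTimed(list, out buildMs);
        Console.WriteLine("HashSet/Build: {0}ms", buildMs);

        // Time 10,000 lookups for comparison with the build cost.
        Stopwatch watch = Stopwatch.StartNew();
        int hits = 0;
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            if (set.Contains(random.Next(size))) hits++;
        }
        watch.Stop();
        Console.WriteLine("HashSet/Contains: {0}ms ({1} hits)", watch.ElapsedMilliseconds, hits);
    }
}
```

If the set is built once and queried many times, the build cost is paid only once, which is the situation the benchmark models.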


Answer 3:

It is worth mentioning that this comparison is a bit unfair: the Array class does not have its own Contains() method, so the call goes through an extension method for IEnumerable<T> that uses a sequential enumerator and is therefore not optimized for Array instances. HashSet<T>, on the other hand, has its own implementation, fully optimized for all sizes.

For a fairer comparison you could use the static method int Array.IndexOf(), which is implemented specifically for Array instances, even though it only uses a for loop that is slightly more efficient than an enumerator.

Having said that, the performance of HashSet<T>.Contains() is similar to that of Array.IndexOf() for small sets (I would say up to 5 elements) and much better for large sets.
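As a minimal sketch of that suggestion, a membership test built on Array.IndexOf could look like the following. The wrapper name ContainsViaIndexOf is made up for illustration, and how it compares with the LINQ call is likely to depend on the runtime version, so it is worth measuring rather than assuming.

```csharp
using System;

public static class ArraySearchDemo
{
    // Array.IndexOf returns -1 for a zero-based array when the value is absent.
    public static bool ContainsViaIndexOf(int[] arr, int value)
    {
        return Array.IndexOf(arr, value) >= 0;
    }

    public static void Main()
    {
        int[] arr = { 3, 1, 4, 1, 5 };
        Console.WriteLine(ContainsViaIndexOf(arr, 4)); // True
        Console.WriteLine(ContainsViaIndexOf(arr, 9)); // False
    }
}
```

In the benchmark, this call would replace arr.Contains(find[rpt]) inside the Array/Contains loop.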


