Question:
I have been searching for a performance benchmark comparing the Contains, Exists and Any methods available on List<T>. I wanted to find this out purely out of curiosity, as I was always confused about these. Many questions on SO describe the definitions of these methods, such as:
- LINQ Ring: Any() vs Contains() for Huge Collections
- Linq .Any VS .Exists - Whats the difference?
- LINQ extension methods - Any() vs. Where() vs. Exists()
So I decided to do it myself. I am adding it as an answer. Any more insight on the results is most welcome. I also ran this benchmark for arrays to see the results.
Answer 1:
According to the documentation:
List<T>.Exists (Object method)
Determines whether the List<T> contains elements that match the conditions defined by the specified predicate.
IEnumerable<T>.Any (Extension method)
Determines whether any element of a sequence satisfies a condition.
List<T>.Contains (Object method)
Determines whether an element is in the List<T>.
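To make the three definitions above concrete, here is a minimal sketch of the same membership check written each way. The class and helper names (`MembershipDemo`, `ViaContains`, `ViaExists`, `ViaAny`) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MembershipDemo
{
    // List<T>.Contains: instance method, compares with the default equality comparer.
    public static bool ViaContains(List<int> list, int value) => list.Contains(value);

    // List<T>.Exists: instance method that takes a Predicate<T>.
    public static bool ViaExists(List<int> list, int value) => list.Exists(a => a == value);

    // Enumerable.Any: LINQ extension method, works on any IEnumerable<T>.
    public static bool ViaAny(IEnumerable<int> sequence, int value) => sequence.Any(a => a == value);

    public static void Main()
    {
        var list = new List<int> { 3, 1, 4, 1, 5 };
        Console.WriteLine(ViaContains(list, 4)); // True
        Console.WriteLine(ViaExists(list, 4));   // True
        Console.WriteLine(ViaAny(list, 9));      // False
    }
}
```

All three answer the same question here; Exists and Any additionally accept arbitrary conditions, which Contains does not.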
Benchmarking:
CODE:
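using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Reconstructed from a garbled extraction: generic type arguments, loop headers and
// several loop bodies were lost. Loop bounds follow the collection sizes and the output
// labels follow the RESULTS section; the Random.Next upper bounds are assumptions.
static void Main(string[] args)
{
    ContainsExistsAnyShort();
    ContainsExistsAny();
}

private static void ContainsExistsAny()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("********* ContainsExistsAny ***********");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(6000000);
    Random random = new Random();
    for (int i = 0; i < 6000000; i++)
    {
        list.Add(random.Next(6000000)); // value range assumed
    }
    int[] arr = list.ToArray();

    find(list, arr);
}

private static void ContainsExistsAnyShort()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("***** ContainsExistsAnyShortRange *****");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(2000);
    Random random = new Random();
    for (int i = 0; i < 2000; i++)
    {
        list.Add(random.Next(6000000)); // value range assumed
    }
    int[] arr = list.ToArray();

    find(list, arr);
}

private static void find(List<int> list, int[] arr)
{
    // Pre-generate the values to look up so that every method searches for the same ones.
    Random random = new Random();
    int[] find = new int[10000];
    for (int i = 0; i < 10000; i++)
    {
        find[i] = random.Next(6000000); // value range assumed
    }

    Stopwatch watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Exists(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Exists: {0:N0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Any: {0:N0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

    Console.WriteLine("Arrays do not have Exists");

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Any: {0:N0}ms", watch.ElapsedMilliseconds);
}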
RESULTS
***************************************
***** ContainsExistsAnyShortRange *****
***************************************
List/Contains: 96ms
List/Exists: 146ms
List/Any: 381ms
Array/Contains: 34ms
Arrays do not have Exists
Array/Any: 410ms
***************************************
********* ContainsExistsAny ***********
***************************************
List/Contains: 257,996ms
List/Exists: 379,951ms
List/Any: 884,853ms
Array/Contains: 72,486ms
Arrays do not have Exists
Array/Any: 1,013,303ms
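Answer 2:
The fastest way is to use a HashSet<T>. Contains on a HashSet<T> is O(1) on average.
I took your code and added a benchmark for HashSet<int>. Building the set with HashSet<int> set = new HashSet<int>(list); is a one-time O(n) step whose cost is nearly zero next to the list and array lookup times.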
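using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Reconstructed from a garbled, truncated extraction: generic type arguments, loop headers,
// several loop bodies and the final HashSet loop were lost. Loop bounds follow the collection
// sizes and the output labels follow the RESULTS section; the Random.Next upper bounds are assumptions.
void Main()
{
    ContainsExistsAnyShort();
    ContainsExistsAny();
}

private static void ContainsExistsAny()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("********* ContainsExistsAny ***********");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(6000000);
    Random random = new Random();
    for (int i = 0; i < 6000000; i++)
    {
        list.Add(random.Next(6000000)); // value range assumed
    }
    int[] arr = list.ToArray();
    HashSet<int> set = new HashSet<int>(list);

    find(list, arr, set);
}

private static void ContainsExistsAnyShort()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("***** ContainsExistsAnyShortRange *****");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(2000);
    Random random = new Random();
    for (int i = 0; i < 2000; i++)
    {
        list.Add(random.Next(6000000)); // value range assumed
    }
    int[] arr = list.ToArray();
    HashSet<int> set = new HashSet<int>(list);

    find(list, arr, set);
}

private static void find(List<int> list, int[] arr, HashSet<int> set)
{
    // Pre-generate the values to look up so that every method searches for the same ones.
    Random random = new Random();
    int[] find = new int[10000];
    for (int i = 0; i < 10000; i++)
    {
        find[i] = random.Next(6000000); // value range assumed
    }

    Stopwatch watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Contains: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Exists(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Exists: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Any: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Contains: {0}ms", watch.ElapsedMilliseconds);

    Console.WriteLine("Arrays do not have Exists");

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Any: {0}ms", watch.ElapsedMilliseconds);

    // The source is cut off at this loop; the body and output line are inferred from the results.
    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        set.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("HashSet/Contains: {0}ms", watch.ElapsedMilliseconds);
}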
RESULTS
***************************************
***** ContainsExistsAnyShortRange *****
***************************************
List/Contains: 65ms
List/Exists: 106ms
List/Any: 222ms
Array/Contains: 20ms
Arrays do not have Exists
Array/Any: 281ms
HashSet/Contains: 0ms
***************************************
********* ContainsExistsAny ***********
***************************************
List/Contains: 120522ms
List/Exists: 250445ms
List/Any: 653530ms
Array/Contains: 40801ms
Arrays do not have Exists
Array/Any: 522371ms
HashSet/Contains: 3ms
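The pattern this answer relies on, building the set once and then reusing it for many lookups, can be sketched on its own. This is a small illustration rather than the benchmark itself; the class name `HashSetLookupDemo`, the helper `CountHits` and the sample data are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class HashSetLookupDemo
{
    // Counts how many of the probe values are present in the set.
    // Each Contains call is a hash lookup, not a scan of the collection.
    public static int CountHits(HashSet<int> set, int[] probes)
    {
        int hits = 0;
        foreach (int probe in probes)
        {
            if (set.Contains(probe)) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        // Illustrative data: the even numbers 0, 2, ..., 1999998.
        var list = new List<int>(1000000);
        for (int i = 0; i < 1000000; i++) list.Add(i * 2);

        // One-time O(n) construction; every later lookup reuses the same set.
        Stopwatch build = Stopwatch.StartNew();
        var set = new HashSet<int>(list);
        build.Stop();

        int[] probes = { 0, 1, 2, 1999998, 1999999 };
        Console.WriteLine("Hits: {0}", CountHits(set, probes)); // Hits: 3
        Console.WriteLine("Build: {0}ms", build.ElapsedMilliseconds);
    }
}
```

The construction cost only pays off when the set is queried repeatedly; for a single lookup, a linear scan of the list does no more work than building the set would.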
Answer 3:
It is worth mentioning that this comparison is a bit unfair, since the Array class does not have its own Contains() method. It uses the LINQ extension method for IEnumerable<T>, which in the general case walks the sequence with an enumerator, so it is not specifically optimized for Array instances. HashSet<T>, on the other hand, has its own implementation, fully optimized for all sizes.
To compare fairly you could use the static method int Array.IndexOf(), which is implemented specifically for Array instances, although it still uses a for loop, which is only slightly more efficient than an enumerator.
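A minimal sketch of the Array.IndexOf() membership check suggested above; the class name `ArrayIndexOfDemo` and the helper `ContainsValue` are hypothetical and used only for this illustration.

```csharp
using System;

public static class ArrayIndexOfDemo
{
    // Array.IndexOf returns the index of the first match, or -1 for a
    // zero-based array when the value is absent, so ">= 0" means "found".
    public static bool ContainsValue(int[] arr, int value)
    {
        return Array.IndexOf(arr, value) >= 0;
    }

    public static void Main()
    {
        int[] arr = { 10, 20, 30 };
        Console.WriteLine(Array.IndexOf(arr, 20)); // 1
        Console.WriteLine(ContainsValue(arr, 20)); // True
        Console.WriteLine(ContainsValue(arr, 99)); // False
    }
}
```

In the benchmark, this call would take the place of arr.Contains(find[rpt]) inside the Array/Contains loop.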
Having said that, the performance of HashSet<T>.Contains() is similar to that of Array.IndexOf() for small sets (I would say up to about 5 elements) and much better for large sets.