Why is OfType<> faster than Cast<>?

猫巷女王i 2020-12-15 03:39

In answer to the following question: How to convert MatchCollection to string array

Given the two LINQ expressions:

var arr = Regex.Matches(strText,          


        
4 Answers
  • 2020-12-15 04:25

    OfType() should be slower, since it performs a safe is type check before the actual explicit cast, whereas Cast() performs only the explicit cast.

    Theoretically, OfType would be faster when many elements are of the "wrong type": after a failed is check the loop simply moves on to the next element, whereas Cast() on the same collection throws an InvalidCastException at the first element of the "wrong type", and throwing and handling exceptions is relatively slow.
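
    A minimal sketch of that difference on mixed-type input, assuming a hypothetical ArrayList named mixed that holds strings, an int, and a null (none of these names come from the question):

    ```csharp
    using System;
    using System.Collections;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            // Hypothetical non-generic collection with mixed contents.
            var mixed = new ArrayList { "a", 1, "b", null };

            // OfType<string>() skips anything that fails the "is string" check,
            // which includes the int and the null reference.
            string[] filtered = mixed.OfType<string>().ToArray();
            Console.WriteLine(filtered.Length); // prints 2

            try
            {
                // Cast<string>() casts every element and fails on the boxed int.
                mixed.Cast<string>().ToArray();
            }
            catch (InvalidCastException)
            {
                Console.WriteLine("Cast threw InvalidCastException");
            }
        }
    }
    ```

    So the two operators are only interchangeable when every element is known to be of the target type, which is the case being benchmarked in this thread.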

    Source code extracted using ILSpy:

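    // System.Linq.Enumerable
    public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source)
    {
        if (source == null)
        {
            throw Error.ArgumentNull("source");
        }
        // Argument checks run eagerly; the loop lives in a separate iterator,
        // because C# does not allow "return value" and "yield return" in one method.
        return OfTypeIterator<TResult>(source);
    }
    
    // System.Linq.Enumerable
    private static IEnumerable<TResult> OfTypeIterator<TResult>(IEnumerable source)
    {
        foreach (object current in source)
        {
            // **Type check**
            if (current is TResult)
            {
                // **Explicit cast**
                yield return (TResult)current;
            }
        }
        yield break;
    }
    
    // System.Linq.Enumerable
    public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)
    {
        // Short-circuit: if the source is already an IEnumerable<TResult>, return it as-is.
        IEnumerable<TResult> enumerable = source as IEnumerable<TResult>;
        if (enumerable != null)
        {
            return enumerable;
        }
        if (source == null)
        {
            throw Error.ArgumentNull("source");
        }
        return CastIterator<TResult>(source);
    }
    
    // System.Linq.Enumerable
    private static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source)
    {
        foreach (object current in source)
        {
            // **Explicit cast only**
            yield return (TResult)current;
        }
        yield break;
    }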
    
  • 2020-12-15 04:26

    Actually, OfType() first checks the type and then casts, whereas Cast() does only the second part. So OfType() will obviously be slower than direct casting.

    http://codenets.blogspot.in/2010/06/cast-vs-oftype.html

  • 2020-12-15 04:40

    My benchmarking does not agree with your benchmarking.

    I ran an identical benchmark to Alex's and got the opposite result. I then tweaked the benchmark somewhat and again observed Cast being faster than OfType.

    There's not much in it, but I believe that Cast does have the edge, as it should because its iterator is simpler. (No is check.)

    Edit: Actually after some further tweaking I managed to get Cast to be 50x faster than OfType.

    Below is the code of the benchmark that gives the biggest discrepancy I've found so far:

    // Requires: using System; using System.Diagnostics; using System.Linq;
    Stopwatch sw1 = new Stopwatch(); // accumulates time spent in OfType
    Stopwatch sw2 = new Stopwatch(); // accumulates time spent in Cast
    
    // Build the input once, up front, so that generating the strings is not timed.
    var ma = Enumerable.Range(1, 100000).Select(i => i.ToString()).ToArray();
    
    // Warm-up: run each operation once before any timing starts.
    var x = ma.OfType<string>().ToArray();
    var y = ma.Cast<string>().ToArray();
    
    for (int i = 0; i < 1000; i++)
    {
        // Alternate which operation runs first on each iteration.
        if (i%2 == 0)
        {
            sw1.Start();
            var arr = ma.OfType<string>().ToArray();
            sw1.Stop();
            sw2.Start();
            var arr2 = ma.Cast<string>().ToArray();
            sw2.Stop();
        }
        else
        {
            sw2.Start();
            var arr2 = ma.Cast<string>().ToArray();
            sw2.Stop();
            sw1.Start();
            var arr = ma.OfType<string>().ToArray();
            sw1.Stop();
        }
    }
    Console.WriteLine("OfType: " + sw1.ElapsedMilliseconds.ToString());
    Console.WriteLine("Cast: " + sw2.ElapsedMilliseconds.ToString());
    Console.ReadLine();
    

    Tweaks I've made:

    • Perform the "generate a list of strings" work once, at the start, and "crystallize" it.
    • Perform one of each operation before starting timing - I'm not sure if this is necessary but I think it means the JITter generates code beforehand rather than while we're timing?
    • Perform each operation multiple times, not just once.
    • Alternate the order in case this makes a difference.

    On my machine this results in ~350ms for Cast and ~18000ms for OfType.

    I think the biggest difference is that we're no longer timing how long MatchCollection takes to find the next match. (Or, in my code, how long int.ToString() takes.) This drastically improves the signal-to-noise ratio.

    Edit: As sixlettervariables pointed out, the reason for this massive difference is that Cast will short-circuit and not bother casting individual items if it can cast the whole IEnumerable. When I switched from using Regex.Matches to an array to avoid measuring the regex processing time, I also switched to using something castable to IEnumerable<string> and thus activated this short-circuiting. When I altered my benchmark to disable this short-circuiting, I get a slight advantage to Cast rather than a massive one.
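
    The short-circuit is easy to observe directly. A minimal sketch, assuming two hypothetical arrays named typed and untyped (they are illustrations only, not the data from the benchmark):

    ```csharp
    using System;
    using System.Linq;

    class Program
    {
        static void Main()
        {
            // A string[] already implements IEnumerable<string>,
            // so Cast<string>() hands back the very same object.
            string[] typed = { "a", "b", "c" };
            Console.WriteLine(ReferenceEquals(typed.Cast<string>(), typed)); // prints True

            // An object[] is not an IEnumerable<string>, so Cast<string>()
            // has to return an iterator that casts element by element.
            object[] untyped = { "a", "b", "c" };
            Console.WriteLine(ReferenceEquals(untyped.Cast<string>(), untyped)); // prints False
        }
    }
    ```

    With a source like typed, ToArray() is effectively copying an array, which is why the gap looked so large; a source like untyped forces the per-element cast and gives a like-for-like comparison with OfType.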

  • 2020-12-15 04:44

    Just reverse the order of OfType and Cast in your method and you'll note that there is no difference. The first one always runs faster than the second one. This is a case of a bad microbenchmark.

    Wrapping your code in a loop to run them in random order:

    OfType: 1224
    Cast: 2815
    Cast: 2961
    OfType: 3010
    OfType: 3027
    Cast: 2987
    ...
    

    And then again:

    Cast: 1207
    OfType: 2781
    Cast: 2930
    OfType: 2964
    OfType: 2964
    OfType: 2987
    ...
    

    Lifting out the Regex.Matches, which appears to cause the problem:

    Cast: 1247
    OfType: 210
    OfType: 170
    Cast: 171
    ...
    

    and

    OfType: 1225
    Cast: 202
    OfType: 171
    Cast: 192
    Cast: 415
    

    So, no. OfType is not faster than Cast. And no, Cast is not faster than OfType.
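
    For reference, a rough sketch of a randomized-order harness of this kind. The input text and the \d+ pattern are hypothetical stand-ins, since the question's own input is not shown here, and the printed timings will differ from the numbers above:

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            // Hypothetical input and pattern, chosen only to produce many matches.
            string text = string.Join(" ", Enumerable.Range(0, 20000));

            // Lift Regex.Matches out of the timed region. Reading Count forces
            // the lazily evaluated MatchCollection to find every match now.
            MatchCollection matches = Regex.Matches(text, @"\d+");
            int total = matches.Count;

            var random = new Random();
            for (int run = 0; run < 10; run++)
            {
                // Pick the operator at random so neither always goes first.
                bool useCast = random.Next(2) == 0;

                Stopwatch sw = Stopwatch.StartNew();
                string[] values = useCast
                    ? matches.Cast<Match>().Select(m => m.Value).ToArray()
                    : matches.OfType<Match>().Select(m => m.Value).ToArray();
                sw.Stop();

                Console.WriteLine((useCast ? "Cast: " : "OfType: ") + sw.ElapsedTicks
                    + " ticks for " + values.Length + " of " + total + " matches");
            }
        }
    }
    ```

    On runtimes where MatchCollection itself implements IEnumerable<Match>, Cast<Match>() returns the collection directly, so that case should be kept in mind when reading the numbers.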
