Immutable Dictionary Vs Dictionary Vs C5 Vs F# - performance

前端 未结 5 611
北荒
北荒 2020-12-30 01:19

Our application uses plenty of dictionaries which have multi level lookup that are not frequently changing. We are investigating at converting some of the critical code that

相关标签:
5条回答
  • 2020-12-30 02:01

    With .NET 3.1 they seem very similar on small amount.

    |    Method | MAXITEMS |      Mean |     Error |    StdDev |
    |---------- |--------- |----------:|----------:|----------:|
    |       imm |       10 | 0.9921 ns | 0.0630 ns | 0.1837 ns |
    |       scg |       10 | 0.9699 ns | 0.0571 ns | 0.1683 ns |
    | scgsorted |       10 | 1.0136 ns | 0.0577 ns | 0.1684 ns |
    |       imm |      100 | 1.5296 ns | 0.1153 ns | 0.3327 ns |
    |       scg |      100 | 1.3151 ns | 0.0694 ns | 0.1990 ns |
    | scgsorted |      100 | 1.4516 ns | 0.0855 ns | 0.2426 ns |
    |       imm |     1000 | 0.8514 ns | 0.0905 ns | 0.2582 ns |
    |       scg |     1000 | 1.0898 ns | 0.0552 ns | 0.1416 ns |
    | scgsorted |     1000 | 1.0302 ns | 0.0533 ns | 0.1001 ns |
    |       imm |    10000 | 1.0280 ns | 0.0530 ns | 0.1175 ns |
    |       scg |    10000 | 0.9929 ns | 0.0523 ns | 0.1253 ns |
    | scgsorted |    10000 | 1.0531 ns | 0.0534 ns | 0.1248 ns |
    

    Test source code:

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    
    namespace CollectionsTest
    {
        using System;
        using System.Collections.Generic;
        using System.Collections.Immutable;
        using System.Runtime;
    
        /// <summary>
        ///
        /// </summary>
        public class Program
        {
            private static ImmutableDictionary<string, object> idictionary;
            private static Dictionary<string, object> dictionary;
            private static SortedDictionary<string, object> sorteddictionary;
            private static string[] randomIndexes;
    
            [Params(10, 100, 1000, 10000)] public  int MAXITEMS { get; set; }
    
            public Program()
            {
                Console.WriteLine("\n# - {0}", MAXITEMS);
    
                List<string> accessIndex = new List<string>(MAXITEMS);
                List<KeyValuePair<string, object>> listofkvps = new List<KeyValuePair<string, object>>();
                List<Tuple<string, object>> listoftuples = new List<Tuple<string, object>>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    listoftuples.Add(new Tuple<string, object>(i.ToString(), i));
                    listofkvps.Add(new KeyValuePair<string, object>(i.ToString(), i));
                    accessIndex.Add(i.ToString());
                }
    
                // Randomize for lookups
                Random r = new Random(Environment.TickCount);
                List<string> randomIndexesList = new List<string>(MAXITEMS);
                while (accessIndex.Count > 0)
                {
                    int index = r.Next(accessIndex.Count);
                    string value = accessIndex[index];
                    accessIndex.RemoveAt(index);
    
                    randomIndexesList.Add(value);
                }
    
                // Convert to array for best perf
                randomIndexes = randomIndexesList.ToArray();
    
                // LOAD ------------------------------------------------------------------------------------------------
    
                // IMMU
                idictionary = listofkvps.ToImmutableDictionary();
                //Console.WriteLine(idictionary.Count);
    
                // SCGD
                dictionary = new Dictionary<string, object>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    dictionary.Add(i.ToString(), i);
                }
    
                sorteddictionary = new SortedDictionary<string, object>();
                for (int i = 0; i < MAXITEMS; i++)
                {
                    sorteddictionary.Add(i.ToString(), i);
                }
                //Console.WriteLine(dictionary.Count);
    
    
                //scg(randomIndexes, dictionary);
    
    
                //imm(randomIndexes, idictionary);
            }
    
            /// <summary>
            /// Mains the specified args.
            /// </summary>
            /// <param name="args">The args.</param>
            public static void Main(string[] args)
            {
                // INIT TEST DATA ------------------------------------------------------------------------------------------------
    
    
    
                Console.WriteLine(BenchmarkRunner.Run<Program>());
            }
    
            [Benchmark]
            public   void imm()
            {
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    object value;
                    idictionary.TryGetValue(i, out value);
                }
            }
    
            [Benchmark]
            public   void scg()
            {
                // TEST ------------------------------------------------------------------------------------------------
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    object value;
                    dictionary.TryGetValue(i, out value);
                }
            }
    
            [Benchmark]
            public   void scgsorted()
            {
                // TEST ------------------------------------------------------------------------------------------------
                for (int index = 0, indexMax = randomIndexes.Length; index < indexMax; index++)
                {
                    string i = randomIndexes[index];
                    object value;
                    sorteddictionary.TryGetValue(i, out value);
                }
            }
        }
    }
    
    0 讨论(0)
  • 2020-12-30 02:06

    The F# map structure is implemented as a binary tree and, as such, is not actually a dictionary. As noted here, a standard .net dictionary is the fastest you're going to get.

    0 讨论(0)
  • 2020-12-30 02:15

    Here's some more benchmarks.

    .NET 4.5 version 1.3.1 of System.Collections.Immutable package, 64-bit.

    Dictionary vs ImmutableDictionary:

    BuilderBenchmark elapsed time
     Mutable   : Avg= 15.213, Stdev  4.591 [ms] (   10.2x)
     Immutable : Avg=155.883, Stdev 15.145 [ms]
    BuilderBenchmark per op
     Mutable   : Avg=  0.152, Stdev  0.046 [us] (   10.2x)
     Immutable : Avg=  1.559, Stdev  0.151 [us]
    SetItemBenchmark elapsed time
     Mutable   : Avg= 13.100, Stdev  2.975 [ms] (   30.4x)
     Immutable : Avg=397.932, Stdev 18.551 [ms]
    SetItemBenchmark per op
     Mutable   : Avg=  0.131, Stdev  0.030 [us] (   30.4x)
     Immutable : Avg=  3.979, Stdev  0.186 [us]
    LookupBenchmark elapsed time
     Mutable   : Avg=  9.439, Stdev  0.942 [ms] (    3.6x)
     Immutable : Avg= 34.250, Stdev  3.457 [ms]
    LookupBenchmark per op
     Mutable   : Avg=  0.094, Stdev  0.009 [us] (    3.6x)
     Immutable : Avg=  0.343, Stdev  0.035 [us]
    

    Dictionary vs ImmutableSortedDictionary:

    BuilderBenchmark elapsed time
     Mutable   : Avg= 13.654, Stdev  5.124 [ms] (   34.5x)
     Immutable : Avg=471.574, Stdev 20.719 [ms]
    BuilderBenchmark per op
     Mutable   : Avg=  0.137, Stdev  0.051 [us] (   34.5x)
     Immutable : Avg=  4.716, Stdev  0.207 [us]
    SetItemBenchmark elapsed time
     Mutable   : Avg= 11.838, Stdev  0.530 [ms] (   37.6x)
     Immutable : Avg=444.964, Stdev 11.125 [ms]
    SetItemBenchmark per op
     Mutable   : Avg=  0.118, Stdev  0.005 [us] (   37.6x)
     Immutable : Avg=  4.450, Stdev  0.111 [us]
    LookupBenchmark elapsed time
     Mutable   : Avg=  9.354, Stdev  0.542 [ms] (    4.4x)
     Immutable : Avg= 40.988, Stdev  3.242 [ms]
    LookupBenchmark per op
     Mutable   : Avg=  0.094, Stdev  0.005 [us] (    4.4x)
     Immutable : Avg=  0.410, Stdev  0.032 [us]
    

    I wanted to know how much slower immutable collections would be. Note that the entire run is for 100,000 insert operations. I'm happy to report that the lookup performance degradation is only 4x while the insert performance degradation is 10x, still pretty decent. ImmutableDictionary is clearly superior to ImmutableSortedDictionary unless you absolutely need a sorted data structure.

    Side note on copy on write behavior. These persistent data structures are several orders of magnitude faster than any naive copy on write implementation. Which was what I was using. It's also very easy to submit changes (with data race detection) using the compare and swap CAS instruction.

    Attachments

    Program.cs:

    using System;
    using System.Collections.Generic;
    using System.Collections.Immutable;
    using System.Diagnostics;
    using System.Linq;
    
    using ImmutableDictionary = System.Collections.Immutable.ImmutableDictionary; // select implementation to benchmark here
    
    namespace DictPerf
    {
        class Program
        {
            static string alphaNum = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    
            static string NextString(Random r, char[] buf)
            {
                int i = 0, len = r.Next(buf.Length) + 1;
                for (; i < len; i++)
                {
                    buf[i] = alphaNum[r.Next(alphaNum.Length)];
                }
                return new string(buf, 0, len);
            }
    
            static HashSet<string> strings = new HashSet<string>();
    
            private static void Seed()
            {
                var r = new Random();
                var buf = new char[64];
                for (int i = 0; i < 100000; i++)
                {
                    strings.Add(NextString(r, buf));
                }
            }
    
            static void Main(string[] args)
            {
                Seed();
    
                Benchmark(RunDictionaryBuilderBenchmark, RunImmutableDictionaryBuilderBenchmark, "BuilderBenchmark");
                Benchmark(RunDictionarySetItemBenchmark, RunImmutableDictionarySetItemBenchmark, "SetItemBenchmark");
                Benchmark(RunDictionaryLookupBenchmark, RunImmutableDictionaryLookupBenchmark, "LookupBenchmark");
            }
    
            private static string Stats(IEnumerable<double> source)
            {
                var avg = source.Average();
                var variance = source.Select(val => (val - avg) * (val - avg)).Sum();
                var stdev = Math.Sqrt(variance / (source.Count() - 1));
                return $"Avg={avg,7:0.000}, Stdev{stdev,7:0.000}";
            }
    
            private static void Benchmark(Action<ICollection<string>, Stopwatch> benchmark1, Action<ICollection<string>, Stopwatch> benchmark2, string benchmarkName)
            {
                var xs = new List<double>();
                var ys = new List<double>();
    
                var sw = Stopwatch.StartNew();
                for (int i = 0; i < 10; i++)
                {
                    sw.Restart();
                    benchmark1(strings, sw);
                    xs.Add(sw.Elapsed.TotalMilliseconds);
                    sw.Restart();
                    benchmark2(strings, sw);
                    ys.Add(sw.Elapsed.TotalMilliseconds);
                }
    
                var x = xs.Average();
                var y = ys.Average();
                var a = xs.Select(v => v / 100).Average();
                var b = ys.Select(v => v / 100).Average();
    
                Console.WriteLine($"{benchmarkName} elapsed time");
                Console.WriteLine($" Mutable   : {Stats(xs)} [ms] ({y / x,7:0.0}x)");
                Console.WriteLine($" Immutable : {Stats(ys)} [ms]");
                Console.WriteLine($"{benchmarkName} per op");
                Console.WriteLine($" Mutable   : {Stats(xs.Select(v => v / 100))} [us] ({b / a,7:0.0}x)");
                Console.WriteLine($" Immutable : {Stats(ys.Select(v => v / 100))} [us]");
            }
    
            private static void RunDictionaryBuilderBenchmark(ICollection<string> strings, Stopwatch sw)
            {
                var d = new Dictionary<string, object>();
                foreach (var s in strings)
                {
                    d[s] = null;
                }
            }
    
            private static void RunImmutableDictionaryBuilderBenchmark(ICollection<string> strings, Stopwatch sw)
            {
                var d = ImmutableDictionary.CreateBuilder<string, object>();
                foreach (var s in strings)
                {
                    d[s] = null;
                }
                d.ToImmutableDictionary();
            }
    
            private static void RunDictionarySetItemBenchmark(ICollection<string> strings, Stopwatch sw)
            {
                var d = new Dictionary<string, object>();
                foreach (var s in strings)
                {
                    d[s] = null;
                }
            }
    
            private static void RunImmutableDictionarySetItemBenchmark(ICollection<string> strings, Stopwatch sw)
            {
                var d = ImmutableDictionary.Create<string, object>();
                foreach (var s in strings)
                {
                    d = d.SetItem(s, null);
                }
            }
    
            private static void RunDictionaryLookupBenchmark(ICollection<string> strings, Stopwatch timer)
            {
                timer.Stop();
    
                var d = new Dictionary<string, object>();
                foreach (var s in strings)
                {
                    d[s] = null;
                }
    
                timer.Start();
    
                foreach (var s in strings)
                {
                    object v;
                    d.TryGetValue(s, out v);
                }
            }
    
            private static void RunImmutableDictionaryLookupBenchmark(ICollection<string> strings, Stopwatch timer)
            {
                timer.Stop();
    
                var d = ImmutableDictionary.CreateBuilder<string, object>();
                foreach (var s in strings)
                {
                    d[s] = null;
                }
                var x = d.ToImmutableDictionary();
    
                timer.Start();
    
                foreach (var s in strings)
                {
                    object v;
                    x.TryGetValue(s, out v);
                }
            }
        }
    }
    
    0 讨论(0)
  • 2020-12-30 02:20

    The standard dictionary is already quite well optimized. All it really does when you do a lookup is calculate the hash of the provided key (the speed of which depends on the type of the key and how it implements GetHashCode), a modulo operation on the hash value to find the right bucket and then it iterates through the contents of the bucket until it finds the right value (the speed of which depends on the quality of the GetHashCode function, so if the buckets are well balanced and don't contain too many items, and the speed of the Equals method for the type).

    All in all, it doesn't do that much for lookups, so I don't think you'll be able to find a significantly faster generic data structure. However, it's possible that depending on how you plan to use the dictionaries, you're able to build a more specialized solution. For example, I needed a really fast lookup where the key was a type. Instead of using a dictionary and doing dictionary[typeof(T)], I made a generic class like so:

    class ValueStore<T> 
    {
      public static T Value;
    }
    

    So I could just do ValueStore<T>.Value with pretty much zero lookup overhead.

    Whether or not you could do something similar (and whether it's worth it) really depends on your usecase; how many items the structure would hold, how often it's being read and written to, whether it needs to be threadsafe, how important write speed is, etc. For example, if write speed didn't matter at all but if thread safety was required, you would need to do a copy-on-write, where the data structure is never written to but instead copied, ensuring thread safety and lockless (thus: fast) reads, at the cost of write speeds. Specializing it could go as far as reordering the structure on writes to optimize it so the hash buckets don't contain more than N items.

    PS: if you were really desperate for speed but couldn't build a more specialized data structure then you could perhaps get small gains from copying Dictionary<TKey,TValue> and removing the various sanity checks (null checks and such) and virtual/interface method calls. However, I doubt this would give you any more than 20% gain, if that.

    0 讨论(0)
  • 2020-12-30 02:22

    Your presumption that immutable dictionaries allow faster lookup is wrong, because the way almost all immutable collections manage to avoid copying the whole structure on 'modification' is by storing data in a tree and only copying some of the nodes on 'modification', sharing all other nodes. And tree access will generally be slower than accessing a flat array by index, as the mutable cousins do.

    I compared the single-thread read performance of Dictionary<,>, ConcurrentDictionary<,>, and ImmutableDictionary<,> based on your code.

    The average results of 30 runs, after warmup:

    Read performance for various dictionary implementations

    To get a feel for write performance, I also ran a test which adds 50 more entries to the dictionaries. Again, the average results of 30 runs, after warmup:

    Write performance for various dictionary implementations

    Tested on

    • .net 4.5.1
    • Microsoft Bcl Immutable 1.0.34.0

    N.B. It should be noted that immutable dictionaries are so much faster and/or allow for higher levels of concurrency in many real life multi-threaded applications that otherwise would have to resort to slow or error-prone techniques like defensive copying, locking and the likes, in order to cope with mutability in the face of threads. This is especially true if you need snapshot-abilty such as for optimistic concurrency, MVCC.

    Btw, if you run your sample code as is, the value for at least the immutable dictionary will be highly untypical in a normal (longer running) application, because for some reason the immutable dictionary apparently needs time to warm up. The difference in performance is enormous. Just have a look at the output of the first 3 runs:

    Items    Dict   Conc   Immu
    ===========================
       100   1.90   1.00 361.81
      1000   1.07   1.00   4.33
     10000   1.24   1.00   1.74
    100000   1.00   1.33   2.71
    ---------------------------
       100   1.06   1.00   2.56
      1000   1.03   1.00   4.34
     10000   1.00   1.06   3.54
    100000   1.00   1.17   2.76
    ---------------------------
       100   1.06   1.00   2.50
      1000   1.66   1.00   4.16
     10000   1.00   1.02   3.67
    100000   1.00   1.26   3.13
    

    Your question was about read performance (of frozen dictionaries), but the tree characteristics of immutable collections show similarly in the write performance:

    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
                           Mutable (amortized)  Mutable (worst)  Immutable 
    ───────────────────────────────────────────────────────────────────────
     Stack.Push            O(1)                 O(n)             O(1)      
     Queue.Enqueue         O(1)                 O(n)             O(1)      
     List.Add              O(1)                 O(n)             O(log n)  
     HashSet.Add           O(1)                 O(n)             O(log n)  
     SortedSet.Add         O(log n)             O(n)             O(log n)  
     Dictionary.Add        O(1)                 O(n)             O(log n)  
     SortedDictionary.Add  O(log n)             O(n log n)       O(log n)  
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    
    0 讨论(0)
提交回复
热议问题