How to do proper Parallel.ForEach, locking and progress reporting

后端 未结 6 1248
小鲜肉
小鲜肉 2020-12-17 18:48

I\'m trying to implement the Parallel.ForEach pattern and track progress, but I\'m missing something regarding locking. The following example counts to 1000 whe

相关标签:
6条回答
  • 2020-12-17 18:53

    The easiest solution would actually have been to replace the property with a field, and

    lock { ++progress.CurrentCount; }
    

    (I personally prefer the look of the preincrement over the postincrement, as the "++." thing clashes in my mind! But the postincrement would of course work the same.)

    This would have the additional benefit of decreasing overhead and contention, since updating a field is faster than calling a method that updates a field.

    Of course, encapsulating it as a property can have advantages too. IMO, since field and property syntax is identical, the ONLY advantage of using a property over a field, when the property is autoimplemented or equivalent, is when you have a scenario where you may want to deploy one assembly without having to build and deploy dependent assemblies anew. Otherwise, you may as well use faster fields! If the need arises to check a value or add a side effect, you simply convert the field to a property and build again. Therefore, in many practical cases, there is no penalty to using a field.

    However, we are living in a time where many development teams operate dogmatically, and use tools like StyleCop to enforce their dogmatism. Such tools, unlike coders, are not smart enough to judge when using a field is acceptable, so invariably the "rule that is simple enough for even StyleCop to check" becomes "encapsulate fields as properties", "don't use public fields" et cetera...

    0 讨论(0)
  • 2020-12-17 18:58

    Old question, but I think there is a better answer.

    You can report progress using Interlocked.Increment(ref progress) that way you do not have to worry about locking the write operation to progress.

    0 讨论(0)
  • 2020-12-17 19:01

    The usual problem with calling something like count++ from multiple threads (which share the count variable) is that this sequence of events can happen:

    1. Thread A reads the value of count.
    2. Thread B reads the value of count.
    3. Thread A increments its local copy.
    4. Thread B increments its local copy.
    5. Thread A writes the incremented value back to count.
    6. Thread B writes the incremented value back to count.

    This way, the value written by thread A is overwritten by thread B, so the value is actually incremented only once.

    Your code adds locks around operations 1, 2 (get) and 5, 6 (set), but that does nothing to prevent the problematic sequence of events.

    What you need to do is to lock the whole operation, so that while thread A is incrementing the value, thread B can't access it at all:

    lock (progressLock)
    {
        progress.CurrentCount++;
    }
    

    If you know that you will only need incrementing, you could create a method on Progress that encapsulates this.

    0 讨论(0)
  • 2020-12-17 19:12

    The issue here is that ++ is not atomic - one thread can read and increment the value between another thread reading the value and it storing the (now incorrect) incremented value. This is probably compounded by the fact there's a property wrapping your int.

    e.g.

    Thread 1        Thread 2
    reads 5         .
    .               reads 5
    .               writes 6
    writes 6!       .
    

    The locks around the setter and getter don't help this, as there's nothing to stop the lock blocks themseves being called out of order.

    Ordinarily, I'd suggest using Interlocked.Increment, but you can't use this with a property.

    Instead, you could expose _lock and have the lock block be around the progress.CurrentCount++; call.

    0 讨论(0)
  • 2020-12-17 19:13

    Remove lock statements from properties and modify Main body:

     object sync = new object();
            Parallel.ForEach(ids, new ParallelOptions {MaxDegreeOfParallelism = threadCount},
                             id =>
                                 {
                                     lock(sync)
                                     progress.CurrentCount++;
                                 });
    
    0 讨论(0)
  • 2020-12-17 19:20

    It is better to store any database or file system operation in a local buffer variable instead of locking it. locking reduces performance.

    0 讨论(0)
提交回复
热议问题