Parallel.For(): Update variable outside of loop

拜拜、爱过 submitted on 2019-11-27 03:21:06
Ade Miller

You can't do this. sum is being shared across your parallel threads. You need to make sure that the sum variable is only being accessed by one thread at a time:

// DON'T DO THIS!
Parallel.For(0, data.Count, i =>
{
    Interlocked.Add(ref sum, data[i]);
});

BUT... this is an anti-pattern: every iteration contends for the same variable through Interlocked.Add, so you've effectively serialised the loop.

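What you need to do is accumulate a subtotal per thread and merge the subtotals at the end, like this:

Parallel.For<int>(0, data.Count, () => 0, (i, loop, subtotal) =>
    {
        // Runs per iteration; subtotal is private to the current task.
        subtotal += data[i];
        return subtotal;
    },
    // Runs once per task, so sum is touched far less often.
    (x) => Interlocked.Add(ref sum, x)
);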
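
For reference, a self-contained version of the subtotal pattern might look like the sketch below. The class name SubtotalDemo, the method name SumWithSubtotals and the sample data are illustrative assumptions, not part of the original answer; the subtotals are kept as long here simply to leave headroom for larger inputs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SubtotalDemo
{
    // Hypothetical helper: sums the list using one subtotal per task.
    public static long SumWithSubtotals(IList<int> data)
    {
        long sum = 0;

        Parallel.For<long>(0, data.Count,
            () => 0L,                                        // localInit: each task starts at zero
            (i, loopState, subtotal) => subtotal + data[i],  // body: touches only task-local state
            subtotal => Interlocked.Add(ref sum, subtotal)); // localFinally: one atomic add per task

        return sum;
    }

    public static void Main()
    {
        List<int> data = Enumerable.Range(1, 1000).ToList();
        Console.WriteLine(SumWithSubtotals(data)); // 1 + 2 + ... + 1000 = 500500
    }
}
```

The shared variable is only touched in the final delegate, so contention is limited to one atomic add per task rather than one per element.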

You can find further discussion of this on MSDN: http://msdn.microsoft.com/en-us/library/dd460703.aspx

PLUG: You can find more on this in Chapter 2 of A Guide to Parallel Programming

The following is also definitely worth a read...

Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4 - Stephen Toub

sum += y; is actually sum = sum + y;. You are getting incorrect results because of the following race condition:

  1. Thread1 reads sum
  2. Thread2 reads sum
  3. Thread1 calculates sum+y1, and stores the result in sum
  4. Thread2 calculates sum+y2, and stores the result in sum

sum is now equal to sum+y2, instead of sum+y1+y2.
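
The interleaving above can be replayed deterministically on a single thread by splitting each += into its read and write halves. This is only an illustrative sketch: the names LostUpdateDemo, ReplayInterleaving, y1 and y2 are made up for the example, and a real race would interleave unpredictably rather than in this fixed order.

```csharp
using System;

public static class LostUpdateDemo
{
    // Replays steps 1-4 by hand; no real threads are involved.
    public static long ReplayInterleaving(long sum, long y1, long y2)
    {
        long seenByThread1 = sum;   // 1. Thread1 reads sum
        long seenByThread2 = sum;   // 2. Thread2 reads sum
        sum = seenByThread1 + y1;   // 3. Thread1 stores sum + y1
        sum = seenByThread2 + y2;   // 4. Thread2 stores its stale value + y2
        return sum;
    }

    public static void Main()
    {
        // Starting from 10 with y1 = 1 and y2 = 2, the correct total would be 13.
        Console.WriteLine(ReplayInterleaving(10, 1, 2)); // prints 12: y1 was lost
    }
}
```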

Your surmise is correct.

When you write sum += y, the runtime does the following:

  1. Read the field onto the stack
  2. Add y to the value on the stack
  3. Write the result back to the field

If two threads read the field at the same time, the change made by the first thread will be overwritten by the second thread.

You need to use Interlocked.Add, which performs the addition as a single atomic operation.
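
A minimal sketch of that fix, assuming a loop over the range 1 to 10000 (exclusive) like the one in the question. The class and method names are hypothetical. The result is correct, although every iteration still contends on the same variable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedDemo
{
    // Hypothetical helper: sums the integers in [fromInclusive, toExclusive).
    public static long AtomicSum(int fromInclusive, int toExclusive)
    {
        long sum = 0;
        Parallel.For(fromInclusive, toExclusive, y =>
        {
            // Atomic read-modify-write: no update can be lost.
            Interlocked.Add(ref sum, y);
        });
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(AtomicSum(1, 10000)); // 1 + ... + 9999 = 49995000
    }
}
```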

Incrementing a long isn't an atomic operation.

I think it's important to point out that this loop, as written, cannot be partitioned for parallelism because, as mentioned above, each iteration depends on the result of the previous one. Parallel.For is designed for explicitly parallel tasks such as pixel scaling, where no iteration has data dependencies outside itself.

Parallel.For(0, input.Length, x =>
{
    output[x] = input[x] * scalingFactor;
});

The above is an example of code that is easy to partition for parallelism. A word of warning, however: parallelism comes at a cost. Even the loop above is far too simple to be worth a parallel for, because the setup time exceeds the time saved through parallelism.

An important point no-one seems to have mentioned: For data-parallel operations (such as the OP's), it is often better (in terms of both efficiency and simplicity) to use PLINQ instead of the Parallel class. The OP's code is actually trivial to parallelize:

long sum = Enumerable.Range(1, 10000).AsParallel().Sum();

The above snippet uses the ParallelEnumerable.Sum method, although one could also use Aggregate for more general scenarios. Refer to the Parallel Loops chapter for an explanation of these approaches.
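
As a rough illustration of the more general route, the same sum can be written with the ParallelEnumerable.Aggregate overload that takes a seed, an accumulator update, a combiner and a result selector. The class and method names below are hypothetical, and the sketch assumes the same range as the snippet above.

```csharp
using System;
using System.Linq;

public static class PlinqDemo
{
    // Hypothetical helper: sums count consecutive integers starting at start.
    public static long SumWithAggregate(int start, int count)
    {
        return Enumerable.Range(start, count)
            .AsParallel()
            .Aggregate(
                0L,                             // seed used for each partition
                (subtotal, y) => subtotal + y,  // fold an element into a partition subtotal
                (left, right) => left + right,  // combine two partition subtotals
                total => total);                // final projection of the combined result
    }

    public static void Main()
    {
        Console.WriteLine(SumWithAggregate(1, 10000)); // 1 + ... + 10000 = 50005000
    }
}
```

Because the seed is applied to every partition, this form is only safe when the seed is an identity for the operation, as 0 is for addition.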

user347918

What if there are two variables in this code? For example:

long sum1 = 0;
long sum2 = 0;

Parallel.For(1, 10000, y =>
    {
        sum1 += y;
        sum2=sum1*y;
    }
);

What should we do? I am guessing that we have to use an array!
