Iterator in a Parallel.ForEach is run on multiple threads

Submitted by 依然范特西╮ on 2019-12-13 16:24:49

Question


I have some code that I run within a Parallel.ForEach loop. The code in the loop is thread safe, but the iterator that I'm using (a custom method with yield return) is not. It appears that the iterator is being run on multiple threads which could potentially cause issues.

Background to the problem

The iterator contains calls to NHibernate (although that's entirely incidental, as you'll see later). As I was having issues parallelising the code, I looked in NHibernate Profiler to see whether it could shed some light on the situation. Part way through, it started to report "Using a single session in multiple threads is likely a bug".

Now, according to NHibernate Profiler, the offending code was inside the iterator (so it wasn't trying to materialise something elsewhere in the Parallel.ForEach action). So I added some of my own code to detect what NHibernate Profiler was detecting, and I saw the same thing.

The iterator method is being called from multiple threads - which I thought was impossible, as others seem to think also, e.g. this other SO answer: https://stackoverflow.com/a/26553810/8152

Simplified demonstration of the problem

To demonstrate the issue (without needing all my extraneous gubbins dealing with NHibernate, etc.) I wrote a simple console app that shows the same problem:

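using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class Program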
{
    public static void Main(string[] args)
    {
        Parallel.ForEach(YieldedNumbers(), (n) => { Thread.Sleep(n); });
        Console.WriteLine("Done!");
        Console.ReadLine();
    }

    public static IEnumerable<int> YieldedNumbers()
    {
        Random rnd = new Random();
        int lastKnownThread = Thread.CurrentThread.ManagedThreadId;
        int detectedSwitches = 0;
        for (int i = 0; i < 1000; i++)
        {
            int currentThread = Thread.CurrentThread.ManagedThreadId;
            if (lastKnownThread != currentThread)
            {
                detectedSwitches++;
                Console.WriteLine(
                    $"{detectedSwitches}: Last known thread ({lastKnownThread}) is not the same as the current thread ({currentThread}).");
                lastKnownThread = currentThread;
            }
            yield return rnd.Next(100,250);
        }
    }
}

In my test runs, the thread switched between 157 and 174 times over the 1000 iterations. The Sleep simulates the time my action takes.

Summary

Why does Parallel.ForEach do this if the iterator pattern, as implemented in .NET, is inherently not thread-safe? And what would be a good way to get the data that the iterator currently exposes safely (on one thread), yet process it on multiple threads? (E.g. is there any way to force the iterator back onto one thread? Or must the iterator be thread-safe as well as the action that is invoked for each iteration? Or is there some other solution entirely?)

Version history

  • Updated the summary to hopefully avoid or reduce the XY Problem of my original question.

Answer 1:


Is there any way to force the iterator back onto one thread?

No, you have to explicitly handle the condition where multiple threads are calling the iterator. Behind the scenes, if multiple threads are calling IEnumerator.MoveNext(), the iterator will keep advancing as much as it's told to. There is no implicit synchronization happening here.
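One way to handle this explicitly, sketched here rather than taken from the question, is to stop handing the iterator to Parallel.ForEach at all: a single producer task enumerates it and pushes items into a BlockingCollection, and Parallel.ForEach consumes from that queue instead. The names SingleThreadedSource, ProcessInParallel and boundedCapacity are made up for this illustration; BlockingCollection, Partitioner.Create and EnumerablePartitionerOptions.NoBuffering are standard .NET types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SingleThreadedSource
{
    // Hypothetical helper (not from the question, not a .NET API):
    // enumerates 'source' on one task and runs 'action' on the items in parallel.
    public static void ProcessInParallel<T>(
        IEnumerable<T> source, Action<T> action, int boundedCapacity = 100)
    {
        using (var queue = new BlockingCollection<T>(boundedCapacity))
        {
            // The only code that touches the iterator. The delegate is synchronous,
            // so the whole enumeration happens on the one thread that runs this task.
            Task producer = Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (T item in source)
                        queue.Add(item); // blocks while the queue is full
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumers finish
                }
            }, TaskCreationOptions.LongRunning);

            // NoBuffering makes workers take one item at a time instead of chunks.
            var partitioner = Partitioner.Create(
                queue.GetConsumingEnumerable(),
                EnumerablePartitionerOptions.NoBuffering);
            Parallel.ForEach(partitioner, action);

            producer.Wait(); // rethrows anything the iterator threw
        }
    }
}

public static class SingleThreadedSourceDemo
{
    // Only the producer thread writes to this set, so a plain HashSet is enough.
    private static readonly HashSet<int> IteratorThreads = new HashSet<int>();

    private static IEnumerable<int> YieldedNumbers()
    {
        for (int i = 0; i < 100; i++)
        {
            IteratorThreads.Add(Thread.CurrentThread.ManagedThreadId);
            yield return i;
        }
    }

    public static void Main()
    {
        int processed = 0;
        SingleThreadedSource.ProcessInParallel(YieldedNumbers(), n =>
        {
            Thread.Sleep(5); // stands in for the real work
            Interlocked.Increment(ref processed);
        });
        Console.WriteLine(
            $"Processed {processed} items; iterator ran on {IteratorThreads.Count} thread(s).");
    }
}
```

This is a sketch under simplifying assumptions. In particular, it does not cover an action that throws while the producer is blocked on a full queue; passing a CancellationToken to Add would be one way to deal with that.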

@JonSkeet blogged a while back about iterating with locking. That said, this looks like an XY problem to me: should you be calling NHibernate in parallel from multiple threads? Is its context thread-safe at all? Those are questions you should consider before going into the wild with thread-safe iterators.
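As a rough illustration of what iterating with locking can look like (this is not the code from the blog post, and LockedEnumerator and TryGetNext are names invented here), the wrapper below advances the iterator and reads its current value under a single lock. Note that the lock only serialises access: calls still arrive from different threads, so it does not help a resource that has to stay on one thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical wrapper for illustration; not a .NET type.
public sealed class LockedEnumerator<T> : IDisposable
{
    private readonly IEnumerator<T> _inner;
    private readonly object _gate = new object();

    public LockedEnumerator(IEnumerable<T> source)
    {
        _inner = source.GetEnumerator();
    }

    // MoveNext and Current are used under one lock, so no other thread
    // can advance the iterator between the two calls.
    public bool TryGetNext(out T item)
    {
        lock (_gate)
        {
            if (_inner.MoveNext())
            {
                item = _inner.Current;
                return true;
            }
            item = default(T);
            return false;
        }
    }

    public void Dispose()
    {
        lock (_gate) { _inner.Dispose(); }
    }
}

public static class LockedEnumeratorDemo
{
    public static void Main()
    {
        var results = new ConcurrentBag<int>();
        using (var numbers = new LockedEnumerator<int>(Enumerable.Range(0, 1000)))
        {
            // Four workers pull from the same iterator until it is exhausted.
            Task[] workers = Enumerable.Range(0, 4).Select(_ => Task.Run(() =>
            {
                int n;
                while (numbers.TryGetNext(out n))
                    results.Add(n);
            })).ToArray();
            Task.WaitAll(workers);
        }
        Console.WriteLine(
            $"Items: {results.Count}, distinct: {results.Distinct().Count()}");
    }
}
```

Combining the two steps into TryGetNext is the design choice that matters here: with separate MoveNext and Current calls, another thread could advance the iterator in between and an item would be skipped or read twice.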



Source: https://stackoverflow.com/questions/39253715/iterator-in-a-parallel-foreach-is-run-on-multiple-threads
