Question
I have some code that I run within a Parallel.ForEach loop. The code in the loop is thread-safe, but the iterator that I'm using (a custom method with yield return) is not. It appears that the iterator is being run on multiple threads, which could potentially cause issues.
Background to the problem
The iterator contains calls to NHibernate (although that's entirely incidental, as you'll see later). As I was having issues parallelising the code, I looked in NHibernate Profiler to see whether it could shed some light on the situation. Partway through, it started to report "Using a single session in multiple threads is likely a bug".
Now, according to NHibernate Profiler, the offending code was inside the iterator (so it wasn't trying to materialise something elsewhere in the Parallel.ForEach action). So I added some code of my own to detect what NHibernate Profiler was reporting, and I saw the same thing.
The iterator method is being called from multiple threads - which I thought was impossible, as others seem to think also, e.g. this other SO answer: https://stackoverflow.com/a/26553810/8152
Simplified demonstration of the problem
To demonstrate the issue (without needing all my extraneous gubbins dealing with NHibernate, etc.) I wrote a simple console app that shows the same problem:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class Program
{
    public static void Main(string[] args)
    {
        // The Sleep stands in for the real (thread-safe) work done per item.
        Parallel.ForEach(YieldedNumbers(), (n) => { Thread.Sleep(n); });
        Console.WriteLine("Done!");
        Console.ReadLine();
    }

    public static IEnumerable<int> YieldedNumbers()
    {
        Random rnd = new Random();
        int lastKnownThread = Thread.CurrentThread.ManagedThreadId;
        int detectedSwitches = 0;
        for (int i = 0; i < 1000; i++)
        {
            // Report every time the iterator body resumes on a different thread.
            int currentThread = Thread.CurrentThread.ManagedThreadId;
            if (lastKnownThread != currentThread)
            {
                detectedSwitches++;
                Console.WriteLine(
                    $"{detectedSwitches}: Last known thread ({lastKnownThread}) is not the same as the current thread ({currentThread}).");
                lastKnownThread = currentThread;
            }
            yield return rnd.Next(100, 250);
        }
    }
}
Across my test runs, the thread switched between 157 and 174 times in the 1000 iterations. The Sleep simulates the time my action takes.
Summary
- Why does Parallel.ForEach do this if the iterator pattern as implemented in .NET is inherently not thread-safe?
- What would be a good way to get the data that the iterator currently exposes safely (on one thread), yet process it on multiple threads? (E.g. is there any way to force the iterator back onto one thread? Or must the iterator also be thread-safe, as well as the action that is invoked for each iteration? Or some other solution entirely?)
Version history
- Updated the summary to hopefully avoid or reduce the XY Problem of my original question.
Answer 1:
"Is there any way to force the iterator back onto one thread?"
No, you have to explicitly handle the case where multiple threads call the iterator. Behind the scenes, if multiple threads call IEnumerator.MoveNext(), the iterator will keep advancing as often as it is told to. There is no implicit synchronization happening here.
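One way to handle this explicitly is to stop the worker threads from touching the iterator at all: run it on a single dedicated thread and hand the items over through a thread-safe queue. The following is only a minimal sketch under that assumption, not something taken from the question. The class name SingleThreadedProducer, the ForEach helper, and the capacity default are hypothetical; BlockingCollection, Partitioner.Create, and Parallel.ForEach are the standard .NET types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, named for illustration only.
public static class SingleThreadedProducer
{
    // Enumerates `source` on one dedicated thread and runs `body`
    // for each item on Parallel.ForEach worker threads.
    public static void ForEach<T>(IEnumerable<T> source, Action<T> body, int capacity = 64)
    {
        using (var cts = new CancellationTokenSource())
        using (var queue = new BlockingCollection<T>(capacity))
        {
            // LongRunning asks the default scheduler for a dedicated thread;
            // the whole foreach over `source` runs on that one thread.
            Task producer = Task.Factory.StartNew(() =>
            {
                try
                {
                    foreach (T item in source)
                        queue.Add(item, cts.Token); // blocks while the queue is full
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumers finish
                }
            }, CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default);

            try
            {
                // NoBuffering makes workers take one item at a time
                // instead of waiting to fill a chunk.
                var partitioner = Partitioner.Create(
                    queue.GetConsumingEnumerable(), EnumerablePartitionerOptions.NoBuffering);
                Parallel.ForEach(partitioner, body);
            }
            finally
            {
                // If `body` threw, release a producer blocked on a full queue.
                cts.Cancel();
                try
                {
                    producer.Wait(); // rethrows exceptions raised by the iterator
                }
                catch (AggregateException ex) when (ex.InnerException is OperationCanceledException)
                {
                    // The producer was stopped by the cancellation above.
                }
            }
        }
    }
}
```

With the console app from the question, the call would become SingleThreadedProducer.ForEach(YieldedNumbers(), n => Thread.Sleep(n)). The iterator body then only ever sees the producer thread, at the cost of one extra thread and a bounded buffer between producer and workers.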
@JonSkeet blogged a while back about iterating with locking. That said, this looks like an XY problem to me: should you be calling NHibernate in parallel from multiple threads? Is its context thread-safe at all? Those are questions you should consider before going into the wild with thread-safe iterators.
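To illustrate the general idea of iterating with locking, here is a minimal sketch. It is not the code from that blog post; the type name SynchronizedEnumerator and its TryMoveNext method are hypothetical. The point is that MoveNext() and Current are read together under one lock, so no two threads are ever inside the iterator at the same time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, named for illustration only.
public sealed class SynchronizedEnumerator<T> : IDisposable
{
    private readonly IEnumerator<T> _inner;
    private readonly object _gate = new object();

    public SynchronizedEnumerator(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        _inner = source.GetEnumerator();
    }

    // Advances the iterator and reads the current item as one atomic step.
    // Returns false once the underlying sequence is exhausted.
    public bool TryMoveNext(out T value)
    {
        lock (_gate)
        {
            if (_inner.MoveNext())
            {
                value = _inner.Current;
                return true;
            }
            value = default(T);
            return false;
        }
    }

    public void Dispose()
    {
        lock (_gate)
        {
            _inner.Dispose();
        }
    }
}
```

Each worker would then loop with while (it.TryMoveNext(out value)) { ... }. Note that this only serializes access: the iterator body still runs on whichever thread happens to call TryMoveNext, so code that is tied to one specific thread would still see several threads, just never at once.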
Source: https://stackoverflow.com/questions/39253715/iterator-in-a-parallel-foreach-is-run-on-multiple-threads