Rx: How to parallelize a long-running task?

Submitted by 邮差的信 on 2019-12-08 10:45:38

Question


I have the following snippet that enumerates the elements of an XML document (read from the output of an svn log --xml ... process) and then runs a long-running method for each XML element.

var proc = Process.Start(svnProcInfo);
var xml = XDocument.Load(proc.StandardOutput);

var xElements = xml.Descendants("path")
                   .ToObservable()
                   //.SubscribeOn(ThreadPoolScheduler.Instance) 
                   .Select(descendant => LongRunning(descendant));
xElements
    //.SubscribeOn(NewThreadScheduler.Default)
    .Subscribe(result => Console.WriteLine(result));

Console.ReadKey();

The LongRunning method isn't important, but inside it I log the thread it runs on. Let's assume it runs for one whole second.

My problem is that un-commenting either SubscribeOn() line has no effect whatsoever. The calls to LongRunning are sequential, one second apart, and all on the same thread (although that thread is different from the main (initial) thread).

This is a console application.

I'm new to Rx. What am I missing?

EDIT:

After trying Lee Campbell's answer, I noticed another problem.

Console.Error.WriteLine("Main thread " + Thread.CurrentThread.ManagedThreadId);

var xElements = xml.Descendants("path").ToObservable()
    //.ObserveOn(Scheduler.CurrentThread)
    .SelectMany(descendant =>     
          Observable.Start(()=>LongRunning(descendant),NewThreadScheduler.Default))
    .Subscribe(result => Console.WriteLine(
         "Result on: " + Thread.CurrentThread.ManagedThreadId));

[...]

string LongRunning(XElement el)
{
    Console.WriteLine("Execute on: Thread " + Thread.CurrentThread.ManagedThreadId);
    DoWork();
    Console.WriteLine("Finished on Thread " + Thread.CurrentThread.ManagedThreadId);
    return "something";
}

This gives the following output:

Main thread 1
Execute on: Thread 3
Execute on: Thread 4
Execute on: Thread 5
Execute on: Thread 6
Execute on: Thread 7
Finished on Thread 5
Finished on Thread 6
Result on: 5
Result on: 6
Finished on Thread 7
Result on: 7
Finished on Thread 3
Result on: 3
Finished on Thread 4
Result on: 4
Done! Press any key...

What I need is a way to "queue" the results to the same thread. I thought that's what ObserveOn() is for, but un-commenting the ObserveOn() line above doesn't change the results.


Answer 1:


Firstly, Rx is a library (or paradigm) for controlling asynchrony, specifically observable sequences. What you have here is an enumerable sequence (the XML descendants) and a blocking, synchronous LongRunning method call.

By calling ToObservable() on your enumerable sequence, you are really only complying with the interface. Because your sequence is already realized (eager, not lazy), there is nothing really observable or asynchronous about it.

By calling SubscribeOn, you had the right idea, but the conversion has already been done in the ToObservable() operator. You probably meant to call ToObservable(ThreadPoolScheduler.Instance) so that any slow iteration of the IEnumerable can be done on another thread. However, I don't think this is a slow iterator, so this probably doesn't solve anything.
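As a minimal sketch of the scheduler overload mentioned above, the following moves the enumeration off the main thread. The integer range and the done event are illustrative stand-ins that are not part of the original code, and the System.Reactive package is assumed to be referenced. Items are still delivered one at a time, which is why this alone would not parallelize LongRunning.

```csharp
using System;
using System.Linq;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading;

class Program
{
    static void Main()
    {
        Console.WriteLine("Main thread " + Thread.CurrentThread.ManagedThreadId);

        // Hypothetical helper: keeps Main alive until the sequence completes.
        var done = new ManualResetEventSlim();

        // Stand-in for xml.Descendants("path"): any IEnumerable<T> works here.
        Enumerable.Range(1, 3)
            // The scheduler passed to ToObservable decides where the
            // enumerable is iterated and where OnNext is called.
            .ToObservable(ThreadPoolScheduler.Instance)
            .Subscribe(
                x => Console.WriteLine(
                    "Item " + x + " on thread " + Thread.CurrentThread.ManagedThreadId),
                () => done.Set());

        done.Wait();
    }
}
```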

What you most likely want to do (though it is doubtful whether Rx is the best tool for this type of problem) is to schedule the call to the LongRunning method. However, this means you will need to add asynchrony to your Select. A good way to do this is with one of the Rx factory methods such as Observable.FromAsync or Observable.Start. This will, however, make your sequence an IObservable<IObservable<T>>. You can flatten it by using SelectMany or Merge.
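To make the nesting and flattening concrete, here is a small sketch under stated assumptions: LongRunning is a hypothetical one-second stand-in that takes an int rather than an XElement, and the System.Reactive package is assumed to be referenced. Select with Observable.Start yields the nested type, and Merge flattens it; SelectMany would do both steps in a single operator.

```csharp
using System;
using System.Linq;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading;

class Program
{
    // Hypothetical stand-in for the question's LongRunning method.
    static string LongRunning(int id)
    {
        Thread.Sleep(1000);
        return "item " + id + " ran on thread " + Thread.CurrentThread.ManagedThreadId;
    }

    static void Main()
    {
        // Observable.Start schedules each call on the thread pool as soon as
        // the selector runs, so the calls can overlap. Each call becomes its
        // own single-value observable, hence the nested type.
        IObservable<IObservable<string>> nested = Enumerable.Range(1, 4)
            .ToObservable()
            .Select(id => Observable.Start(() => LongRunning(id), ThreadPoolScheduler.Instance));

        // Merge subscribes to every inner observable and forwards the results
        // as they arrive, so the output order is not guaranteed.
        IObservable<string> flat = nested.Merge();

        // Wait blocks the main thread until the flattened sequence completes.
        flat.Do(result => Console.WriteLine(result)).Wait();
    }
}
```

Because results are forwarded from whichever thread finished the work, the subscriber here is called on several threads, which matches the output shown in the question's edit.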

Having said all this, what I think you want to do is:

var proc = Process.Start(svnProcInfo);
var xml = XDocument.Load(proc.StandardOutput);

//EDIT: Added ELS to serialise results onto a single thread.
var els = new EventLoopScheduler(threadStart=>new Thread(threadStart)
    {
        IsBackground=true, 
        Name="MyEventLoopSchedulerThread"
    });

var xElements = xml.Descendants("path").ToObservable()
                .SelectMany(descendant => Observable.Start(()=>LongRunning(descendant),ThreadPoolScheduler.Instance))
                .ObserveOn(els)
                .Subscribe(result => Console.WriteLine(result));

Console.ReadKey();


Source: https://stackoverflow.com/questions/19009569/rx-how-to-parallelize-long-running-task
