Is there an asynchronous version of PLINQ?

烈酒焚心 posted on 2019-12-02 23:13:44

This sounds like a job for Microsoft's Reactive Framework (Rx).

I started with this code as my initial variables:

var items = Enumerable.Range(0, 10).ToArray();

Func<int, bool> Predicate = x => x % 2 == 0;

Func<int, int> ComputeSomeValue = x =>
{
    Thread.Sleep(10000); // simulate 10 seconds of blocking work per item
    return x * 3;
};

Next, I used a regular LINQ query as a baseline:

var results =
    from x in items
    where Predicate(x)
    select ComputeSomeValue(x);

This took 50 seconds (five matching items at 10 seconds each) to compute the following results: 0, 6, 12, 18, 24.
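If you want to reproduce the baseline timing, keep in mind that a LINQ query is lazy, so the clock has to cover the enumeration and not just the query definition. Here is a minimal sketch of how that could be measured. The class name `BaselineTiming` and the `delayMs` parameter are my own additions so the example runs quickly; the code above uses a fixed 10000 ms.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class BaselineTiming
{
    // delayMs is a hypothetical knob; pass 10000 to match the code above.
    public static (int[] Results, TimeSpan Elapsed) Run(int delayMs)
    {
        var items = Enumerable.Range(0, 10).ToArray();
        Func<int, bool> predicate = x => x % 2 == 0;
        Func<int, int> computeSomeValue = x =>
        {
            Thread.Sleep(delayMs); // simulated blocking work
            return x * 3;
        };

        var query =
            from x in items
            where predicate(x)
            select computeSomeValue(x);

        var stopwatch = Stopwatch.StartNew();
        var results = query.ToArray(); // the query only executes here
        stopwatch.Stop();

        return (results, stopwatch.Elapsed);
    }

    public static void Main()
    {
        var (results, elapsed) = Run(delayMs: 100);
        Console.WriteLine(string.Join(", ", results));
        Console.WriteLine($"Elapsed: {elapsed.TotalSeconds:F1} s");
    }
}
```

Since the five matching items are processed one after another, the elapsed time should come out at roughly five times `delayMs`.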

Then I switched over to an observable (reactive framework) query:

var results =
    from x in items.ToObservable()
    where Predicate(x)
    from y in Observable.Start(() => ComputeSomeValue(x))
    select y;

This took 10 seconds to get the same five values.

It's clearly computing in parallel.

However, the results are out of order. So I changed the query to this:

var query =
    from x in items.ToObservable()
    where Predicate(x)
    from y in Observable.Start(() => ComputeSomeValue(x))
    select new { x, y };

var results =
    query
        .ToEnumerable()
        .OrderBy(z => z.x)
        .Select(z => z.y);

That still took 10 seconds, but I got the results back in the correct order.

Now, the only issue left is the equivalent of WithDegreeOfParallelism. There are a couple of things to try here.

First up, I changed the code to produce 10,000 values with a 10 ms compute time. My standard LINQ query still took 50 seconds (5,000 matching values at 10 ms each), but the reactive query took 6.3 seconds. If it could perform all the computations at the same time, it should have taken much less. This shows that it is maxing out the asynchronous pipeline, at an effective parallelism of roughly 8 (50 / 6.3).

The second point is that the reactive framework uses schedulers for all of its work scheduling. You could try the various schedulers that come with the reactive framework to find an alternative if the default one doesn't do what you want. Or you could even write your own scheduler to do whatever scheduling you like.
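One way to cap the number of concurrent computations without writing a scheduler is the `Merge` overload that takes a `maxConcurrent` argument. The following is only a sketch, assuming the System.Reactive NuGet package is referenced. The class name `BoundedRxQuery` and the `maxConcurrent` and `delayMs` parameters are illustrative and not part of the original code. `Observable.Defer` is there because `Observable.Start` begins its work as soon as it is called; deferring it means the work only starts when `Merge` subscribes.

```csharp
using System;
using System.Linq;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading;

public static class BoundedRxQuery
{
    // maxConcurrent and delayMs are hypothetical parameters for this sketch.
    public static int[] Run(int maxConcurrent, int delayMs)
    {
        var items = Enumerable.Range(0, 10).ToArray();
        Func<int, bool> predicate = x => x % 2 == 0;
        Func<int, int> computeSomeValue = x =>
        {
            Thread.Sleep(delayMs); // simulated blocking work
            return x * 3;
        };

        var query =
            items.ToObservable()
                .Where(predicate)
                // Defer postpones Observable.Start until Merge subscribes,
                // so at most maxConcurrent computations run at once.
                .Select(x => Observable.Defer(() =>
                    Observable.Start(
                        () => new { x, y = computeSomeValue(x) },
                        TaskPoolScheduler.Default))) // an explicitly chosen scheduler
                .Merge(maxConcurrent);

        // Completion order is not guaranteed, so restore the input order.
        return query
            .ToEnumerable()
            .OrderBy(z => z.x)
            .Select(z => z.y)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(maxConcurrent: 2, delayMs: 100)));
    }
}
```

Swapping `TaskPoolScheduler.Default` for another built-in scheduler, such as `NewThreadScheduler.Default`, is a quick way to compare how the different schedulers behave.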


Here's a version of the query that computes the predicate in parallel too.

var results =
    from x in items.ToObservable()
    from p in Observable.Start(() => Predicate(x))
    where p
    from y in Observable.Start(() => ComputeSomeValue(x))
    select new { x, y };

turdus-merula

As stated here, PLINQ is for running LINQ queries in parallel on multi-core/multi-processor systems. It doesn't have much to do with systems that have a lot of disk units and fast networking. AFAIK, it's made for running executable code on more cores, not for concurrently dispatching multiple I/O requests to the operating system.

Maybe your Predicate(x) is CPU-bound, in which case you can perform that filtering operation using PLINQ. But you cannot apply the I/O-demanding operations (ComputeSomeValue, PerformSomeAction) in the same way.

What you can do is define a chain of operations (two in your case) for each item (see continuation tasks) and dispatch that chain (sequentially (?)).
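A per-item chain like that could look roughly like the sketch below, which uses async/await (the compiler turns each `await` into a continuation). It uses only the base class library. `ComputeSomeValueAsync` and `PerformSomeActionAsync` are hypothetical asynchronous stand-ins for the two operations, with `Task.Delay` simulating the I/O wait, and `delayMs` is an illustrative parameter.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationChain
{
    // Hypothetical stand-in for an I/O-bound computation.
    static async Task<int> ComputeSomeValueAsync(int x, int delayMs)
    {
        await Task.Delay(delayMs); // simulated I/O wait, no thread is blocked
        return x * 3;
    }

    // Hypothetical stand-in for an I/O-bound action; returns the value it acted on.
    static async Task<int> PerformSomeActionAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // The two steps run in sequence for a single item.
    static async Task<int> ProcessAsync(int x, int delayMs)
    {
        int value = await ComputeSomeValueAsync(x, delayMs);
        return await PerformSomeActionAsync(value, delayMs);
    }

    public static async Task<int[]> RunAsync(int delayMs)
    {
        Func<int, bool> predicate = x => x % 2 == 0;

        // Start one chain per matching item; the chains overlap with each other.
        Task<int>[] chains = Enumerable.Range(0, 10)
            .Where(predicate)
            .Select(x => ProcessAsync(x, delayMs))
            .ToArray();

        // Task.WhenAll returns results in the order the tasks were supplied.
        return await Task.WhenAll(chains);
    }

    public static void Main()
    {
        int[] results = RunAsync(delayMs: 100).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", results));
    }
}
```

The steps within one chain stay sequential, while the chains for different items wait on their I/O concurrently without tying up one thread each.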

Also, you mentioned something about an "infinite stream of items". This sounds a bit like the producer-consumer problem, if those items are also I/O-generated.
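For reference, a producer-consumer arrangement can be sketched with `BlockingCollection<T>` from the base class library. This is only an illustration under my own assumptions: the class name `ProducerConsumerSketch` and the `itemCount` and `capacity` parameters are made up, the stream is finite so the example terminates, and the filtering and computation are done inline by a single consumer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // itemCount and capacity are hypothetical parameters for this sketch.
    public static List<int> Run(int itemCount, int capacity)
    {
        var results = new List<int>();

        using (var queue = new BlockingCollection<int>(boundedCapacity: capacity))
        {
            var producer = Task.Run(() =>
            {
                try
                {
                    for (int x = 0; x < itemCount; x++)
                    {
                        queue.Add(x); // blocks while the queue is full
                    }
                }
                finally
                {
                    queue.CompleteAdding(); // lets the consumer loop finish
                }
            });

            var consumer = Task.Run(() =>
            {
                // Blocks while the queue is empty, ends after CompleteAdding.
                foreach (int x in queue.GetConsumingEnumerable())
                {
                    if (x % 2 == 0)
                    {
                        results.Add(x * 3); // only this thread touches results
                    }
                }
            });

            Task.WaitAll(producer, consumer);
        }

        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(itemCount: 10, capacity: 2)));
    }
}
```

The bounded capacity gives back-pressure: a fast producer is held back instead of filling memory while the consumer is still busy with earlier items.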

Maybe your problem is not that multi-core friendly... It may just be I/O-demanding, that's all...
