Tasks combine result and continue

一曲冷凌霜 submitted on 2021-02-11 12:41:01

Question


I have 16 tasks doing the same job, and each of them returns an array. I want to combine the results in pairs and run the same job on each combined pair, repeating until only one task is left. I don't know the best way to do this.

public static IComparatorNetwork[] Prune(IComparatorNetwork[] nets, int numTasks)
    {
        var tasks = new Task[numTasks];
        var netsPerTask = nets.Length/numTasks;
        var start = 0;
        var concurrentSet = new ConcurrentBag<IComparatorNetwork>();
        
        for(var i = 0; i  < numTasks; i++)
        {
            IComparatorNetwork[] taskNets;
            if (i == numTasks - 1)
            {
                taskNets = nets.Skip(start).ToArray();                 
            }
            else
            {
                taskNets = nets.Skip(start).Take(netsPerTask).ToArray();
            }

            start += netsPerTask;
            tasks[i] = Task.Factory.StartNew(() =>
            {
                var pruner = new Pruner();
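                // ConcurrentBag<T> has no built-in AddRange, so add the items one at a time.
                foreach (var net in pruner.Prune(taskNets))
                {
                    concurrentSet.Add(net);
                }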
            });
        }

        Task.WaitAll(tasks.ToArray());

        if(numTasks > 1)
        {
            return Prune(concurrentSet.ToArray(), numTasks/2);
        }

        return concurrentSet.ToArray();
    }

Right now I wait for all tasks to complete, then repeat with half as many tasks until only one is left. I would like to avoid waiting for all of them on each iteration. I am very new to parallel programming, so the approach is probably bad. The code I am trying to parallelize is the following:

public IComparatorNetwork[] Prune(IComparatorNetwork[] nets)
    {
        var result = new List<IComparatorNetwork>();

        for (var i = 0; i < nets.Length; i++) 
        {
            var isSubsumed = false;

            for (var index = result.Count - 1; index >= 0; index--)
            {
                var n = result[index];

                if (nets[i].IsSubsumed(n))
                {
                    isSubsumed = true;
                    break;
                }

                if (n.IsSubsumed(nets[i]))
                {
                    // Remove by index: the position of n is already known, so no search is needed.
                    result.RemoveAt(index);
                }
            }

            if (!isSubsumed) 
            {
                result.Add(nets[i]);
            }
        }

        return result.ToArray();
    }

Answer 1:


What you're fundamentally doing here is aggregating values, but in parallel. Fortunately, PLINQ already has an implementation of Aggregate that works in parallel. In your case you can wrap each element of the original array in its own one-element array; your Prune operation can then combine any two arrays of nets into a single new array.

public static IComparatorNetwork[] Prune(IComparatorNetwork[] nets)
{
    return nets.Select(net => new[] { net })
        .AsParallel()
        .Aggregate((a, b) => new Pruner().Prune(a.Concat(b).ToArray()));
}
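
To try the PLINQ approach without the real network types, here is a minimal sketch on integers. The IsSubsumed rule (a number is subsumed by any kept number that divides it) and the names PlinqPruneDemo, PruneInts and PruneInParallel are hypothetical stand-ins, not part of the original code. It assumes the merged result does not depend on how the arrays are grouped, which holds for this toy rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqPruneDemo
{
    // Hypothetical stand-in for IComparatorNetwork.IsSubsumed:
    // a is "subsumed" by b when b divides a.
    static bool IsSubsumed(int a, int b) => a % b == 0;

    // Same shape as the single-collection Prune from the question, on ints.
    public static int[] PruneInts(int[] items)
    {
        var result = new List<int>();
        foreach (var item in items)
        {
            // Skip the item if something already kept subsumes it.
            if (result.Any(kept => IsSubsumed(item, kept)))
                continue;

            // Otherwise drop everything the new item subsumes, then keep it.
            result.RemoveAll(kept => IsSubsumed(kept, item));
            result.Add(item);
        }
        return result.ToArray();
    }

    // Wrap each element in a one-element array and let PLINQ merge the arrays pairwise.
    public static int[] PruneInParallel(int[] items)
    {
        return items.Select(item => new[] { item })
            .AsParallel()
            .Aggregate((a, b) => PruneInts(a.Concat(b).ToArray()));
    }

    public static void Main()
    {
        var input = Enumerable.Range(2, 29).ToArray(); // 2..30, all distinct
        var pruned = PruneInParallel(input).OrderBy(x => x);
        // Only numbers with no smaller divisor in the input survive.
        Console.WriteLine(string.Join(", ", pruned)); // 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
    }
}
```

The result is sorted before printing because the order in which PLINQ merges the partial arrays is not fixed.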

I'm not very familiar with the internals of PLINQ's Aggregate method, but I would expect it to be pretty good and not spend much time waiting unnecessarily. If you want to write your own, so that you can be sure the workers pull in new work as soon as there is new work, here is my own implementation. Compare the two in your specific situation to see which performs better for your needs. Note that PLINQ is configurable in many ways, so feel free to experiment with other configurations as well.

public static T AggregateInParallel<T>(this IEnumerable<T> values, Func<T, T, T> function, int numTasks)
{
    Queue<T> queue = new Queue<T>();
    foreach (var value in values)
        queue.Enqueue(value);
    if (!queue.Any())
        return default(T);  //Consider throwing or doing something else here if the sequence is empty

    (T, T)? GetFromQueue()
    {
        lock (queue)
        {
            if (queue.Count >= 2)
            {
                return (queue.Dequeue(), queue.Dequeue());
            }
            else
            {
                return null;
            }
        }
    }

    var tasks = Enumerable.Range(0, numTasks)
        .Select(_ => Task.Run(() =>
        {
            var pair = GetFromQueue();
            while (pair != null)
            {
                var result = function(pair.Value.Item1, pair.Value.Item2);
                lock (queue)
                {
                    queue.Enqueue(result);
                }
                pair = GetFromQueue();
            }
        }))
        .ToArray();
    Task.WaitAll(tasks);
    return queue.Dequeue();
}

And the calling code for this version would look like:

public static IComparatorNetwork[] Prune2(IComparatorNetwork[] nets)
{
    return nets.Select(net => new[] { net })
        .AggregateInParallel((a, b) => new Pruner().Prune(a.Concat(b).ToArray()), nets.Length / 2);
}
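
One property of the version above is that a worker returns as soon as it finds fewer than two items queued, even if other workers are still combining pairs whose results will arrive later. If that matters for your workload, the following is a rough sketch of a variant in which idle workers wait for pending results instead. The names WaitingAggregator and AggregateWithWaiting are hypothetical, and the sketch blocks thread-pool threads while waiting, so treat it as a starting point to measure rather than a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WaitingAggregator
{
    // Hypothetical variant of AggregateInParallel: idle workers wait for
    // in-flight results instead of returning when the queue runs low.
    public static T AggregateWithWaiting<T>(IEnumerable<T> values, Func<T, T, T> combine, int numTasks)
    {
        var queue = new Queue<T>(values);
        if (queue.Count == 0)
            throw new InvalidOperationException("Sequence contains no elements.");

        var gate = new object();
        var inFlight = 0; // number of pairs currently being combined

        void Worker()
        {
            while (true)
            {
                T left, right;
                lock (gate)
                {
                    // No pair available yet, but another worker will enqueue a result.
                    while (queue.Count < 2 && inFlight > 0)
                        Monitor.Wait(gate);

                    // No pair available and nothing pending: the aggregation is finished.
                    if (queue.Count < 2)
                        return;

                    left = queue.Dequeue();
                    right = queue.Dequeue();
                    inFlight++;
                }

                try
                {
                    var combined = combine(left, right);
                    lock (gate) { queue.Enqueue(combined); }
                }
                finally
                {
                    // Always wake the waiters, even if combine threw.
                    lock (gate)
                    {
                        inFlight--;
                        Monitor.PulseAll(gate);
                    }
                }
            }
        }

        var tasks = Enumerable.Range(0, Math.Max(1, numTasks))
            .Select(_ => Task.Run(() => Worker()))
            .ToArray();
        Task.WaitAll(tasks);
        return queue.Dequeue(); // exactly one item remains when every worker has returned
    }

    public static void Main()
    {
        // Addition does not depend on grouping or order, so the result is deterministic.
        var sum = AggregateWithWaiting(Enumerable.Range(1, 100), (a, b) => a + b, 4);
        Console.WriteLine(sum); // 5050
    }
}
```

Unlike the version above, this sketch throws on an empty sequence instead of returning default(T); either choice is reasonable.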

As mentioned in the comments, you can make the pruner's Prune method much more efficient by having it accept two collections instead of one and only comparing items from one collection against items from the other, since no item subsumes another item of the same collection (each collection has already been pruned). This makes the method shorter, simpler, and easier to understand, and it also removes a sizeable portion of the expensive comparisons. A few minor adaptations can also greatly reduce the number of intermediate collections created.

public static IReadOnlyList<IComparatorNetwork> Prune(IReadOnlyList<IComparatorNetwork> first, IReadOnlyList<IComparatorNetwork> second)
{
    var firstItemsNotSubsumed = first.Where(outerNet => !second.Any(innerNet => outerNet.IsSubsumed(innerNet)));
    var secondItemsNotSubsumed = second.Where(outerNet => !first.Any(innerNet => outerNet.IsSubsumed(innerNet)));
    return firstItemsNotSubsumed.Concat(secondItemsNotSubsumed).ToList();
}

With that, the calling code just needs minor adaptations to ensure the types match up and that you pass in both collections rather than concatenating them first.

public static IReadOnlyList<IComparatorNetwork> Prune(IReadOnlyList<IComparatorNetwork> nets)
{
    return nets.Select(net => (IReadOnlyList<IComparatorNetwork>)new[] { net })
        .AggregateInParallel((a, b) => Pruner.Prune(a, b), nets.Count / 2);
}
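
To see what the two-collection Prune does on concrete data, here is a small sketch on integers. The IsSubsumed rule (a number is subsumed by any proper divisor) and the class name PairwisePruneDemo are hypothetical stand-ins for the real network types. It assumes both inputs are already pruned, as they are when they come out of earlier merge steps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairwisePruneDemo
{
    // Hypothetical stand-in for IComparatorNetwork.IsSubsumed:
    // a is "subsumed" by b when b is a proper divisor of a.
    static bool IsSubsumed(int a, int b) => a != b && a % b == 0;

    // Same structure as the two-collection Prune above: each item is only
    // compared against items of the other collection.
    public static IReadOnlyList<int> Prune(IReadOnlyList<int> first, IReadOnlyList<int> second)
    {
        var firstKept = first.Where(x => !second.Any(y => IsSubsumed(x, y)));
        var secondKept = second.Where(y => !first.Any(x => IsSubsumed(y, x)));
        return firstKept.Concat(secondKept).ToList();
    }

    public static void Main()
    {
        var first = new[] { 4, 9, 35 };  // already pruned: no element divides another
        var second = new[] { 2, 5, 21 }; // already pruned as well

        // 4 is subsumed by 2 and 35 by 5; everything else survives.
        Console.WriteLine(string.Join(", ", Prune(first, second))); // 9, 2, 5, 21
    }
}
```

One thing to check against the real IsSubsumed: if two equivalent nets can each subsume the other, this two-collection form drops both of them, whereas the single-collection loop keeps one. The toy rule here avoids that by only counting proper divisors.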


Source: https://stackoverflow.com/questions/64265723/tasks-combine-result-and-continue
