TPL dataflow feedback loop

Submitted by 吃可爱长大的小学妹 on 2021-02-08 11:44:19

Question


The pipeline takes the absolute path of a file (in block 1), processes it, and saves it to a database (in block 3).

The constraint is that certain file types (.vde) depend on a particular parent file type (.vd). The parent file type has the metadata required to process the dependent file types. Unless the parent file type is present in the system, I cannot process the dependent file type.

Objective - I need the system to wait for the parent file type to enter the system and update the state. The dependent file types should then be picked up again automatically and processed.

My approach - Add a feedback loop (block 4) that is linked back to block 1. However, I end up losing messages: the messages going from block 4 to block 1 never reach block 3 and are lost somewhere in the pipeline.

What can I do to not lose the messages coming from block 4 to block 1?

Or can this be done in a better way?


Answer 1:


As an exercise, I attempted to make a JoinDependencyBlock, similar to JoinBlock<T1,T2>, that propagates matching elements from two buffers.


Update: I came up with a simpler implementation that internally uses three built-in blocks: two ActionBlocks for input and one BufferBlock for output. Each action block populates a dedicated List, and whenever an element arrives, the other input's list is searched for a match. If one is found, the pair is posted to the BufferBlock; otherwise the element is stored in its own list. The main complexity is in linking these blocks, because the success and failure cases need different handling.

When the JoinDependencyBlock is completed, any unmatched elements in the internal lists are discarded.

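using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class JoinDependencyBlock<T1, T2> : ISourceBlock<(T1, T2)>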
{
    private readonly Func<T1, T2, bool> _matchPredicate;
    private readonly List<T1> _list1 = new List<T1>();
    private readonly List<T2> _list2 = new List<T2>();
    private readonly ActionBlock<T1> _input1;
    private readonly ActionBlock<T2> _input2;
    private readonly BufferBlock<(T1, T2)> _output;
    private readonly object _locker = new object();

    public JoinDependencyBlock(Func<T1, T2, bool> matchPredicate,
        CancellationToken cancellationToken)
    {
        _matchPredicate = matchPredicate
            ?? throw new ArgumentNullException(nameof(matchPredicate));

        // Create the three internal blocks
        var options = new ExecutionDataflowBlockOptions()
        {
            CancellationToken = cancellationToken
        };
        _input1 = new ActionBlock<T1>(Add1, options);
        _input2 = new ActionBlock<T2>(Add2, options);
        _output = new BufferBlock<(T1, T2)>(options);

        // Link the input blocks with the output block
        var inputTasks = new Task[] { _input1.Completion, _input2.Completion };
        // Observe each input separately, so that a failure is handled even
        // when the other input has already completed successfully
        foreach (var inputTask in inputTasks)
        {
            inputTask.ContinueWith(t =>
            {
                // If ANY input block fails, then the whole block has failed
                ((IDataflowBlock)_output).Fault(t.Exception.InnerException);
                if (!_input1.Completion.IsCompleted) _input1.Complete();
                if (!_input2.Completion.IsCompleted) _input2.Complete();
                ClearLists();
            }, default, TaskContinuationOptions.OnlyOnFaulted |
                TaskContinuationOptions.RunContinuationsAsynchronously,
                TaskScheduler.Default);
        }
        Task.WhenAll(inputTasks).ContinueWith(t =>
        {
            // If ALL input blocks succeeded, then the whole block has succeeded
            _output.Complete();
            ClearLists();
        }, default, TaskContinuationOptions.NotOnFaulted |
            TaskContinuationOptions.RunContinuationsAsynchronously,
            TaskScheduler.Default);
    }

    public JoinDependencyBlock(Func<T1, T2, bool> matchPredicate)
        : this(matchPredicate, CancellationToken.None) { }

    public ITargetBlock<T1> Target1 => _input1;
    public ITargetBlock<T2> Target2 => _input2;
    public Task Completion => _output.Completion;

    private void Add1(T1 value1)
    {
        T2 value2;
        lock (_locker)
        {
            var index = _list2.FindIndex(v => _matchPredicate(value1, v));
            if (index < 0)
            {
                // Match not found
                _list1.Add(value1);
                return;
            }
            value2 = _list2[index];
            _list2.RemoveAt(index);
        }
        _output.Post((value1, value2));
    }

    private void Add2(T2 value2)
    {
        T1 value1;
        lock (_locker)
        {
            var index = _list1.FindIndex(v => _matchPredicate(v, value2));
            if (index < 0)
            {
                // Match not found
                _list2.Add(value2);
                return;
            }
            value1 = _list1[index];
            _list1.RemoveAt(index);
        }
        _output.Post((value1, value2));
    }

    private void ClearLists()
    {
        lock (_locker)
        {
            _list1.Clear();
            _list2.Clear();
        }
    }

    public void Complete() => _output.Complete();

    public void Fault(Exception exception)
        => ((IDataflowBlock)_output).Fault(exception);

    public IDisposable LinkTo(ITargetBlock<(T1, T2)> target,
        DataflowLinkOptions linkOptions)
        => _output.LinkTo(target, linkOptions);

    (T1, T2) ISourceBlock<(T1, T2)>.ConsumeMessage(
        DataflowMessageHeader messageHeader, ITargetBlock<(T1, T2)> target,
        out bool messageConsumed)
        => ((ISourceBlock<(T1, T2)>)_output).ConsumeMessage(
            messageHeader, target, out messageConsumed);

    void ISourceBlock<(T1, T2)>.ReleaseReservation(
        DataflowMessageHeader messageHeader, ITargetBlock<(T1, T2)> target)
        => ((ISourceBlock<(T1, T2)>)_output).ReleaseReservation(
            messageHeader, target);

    bool ISourceBlock<(T1, T2)>.ReserveMessage(
        DataflowMessageHeader messageHeader, ITargetBlock<(T1, T2)> target)
        => ((ISourceBlock<(T1, T2)>)_output).ReserveMessage(
            messageHeader, target);
}

Usage example:

var joinBlock = new JoinDependencyBlock<FileInfo, FileInfo>((fi1, fi2) =>
{
    // Check if the files are matched
    var name1 = Path.GetFileNameWithoutExtension(fi1.Name);
    var name2 = Path.GetFileNameWithoutExtension(fi2.Name);
    return StringComparer.OrdinalIgnoreCase.Equals(name1, name2);
});
var actionBlock = new ActionBlock<(FileInfo, FileInfo)>(pair =>
{
    // Process the matching files
    Console.WriteLine(pair.Item1.Name + " :: " +  pair.Item2.Name);
});
joinBlock.LinkTo(actionBlock);
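
The usage example above creates and links the blocks, but it does not show how files reach the two inputs. The following is a minimal sketch of one way to feed them, assuming (as in the question) that .vd files are the parents and .vde files are the dependents. The FileRouting class, its Route method, and the sample paths are hypothetical names introduced here for illustration; they are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, not part of the original answer.
public static class FileRouting
{
    // Posts parent files (.vd) to parentTarget and dependent files (.vde)
    // to dependentTarget. Returns false when the extension is neither of
    // the two, or when the chosen target declines the message.
    public static bool Route(string path,
        ITargetBlock<FileInfo> parentTarget,
        ITargetBlock<FileInfo> dependentTarget)
    {
        // FileInfo does not touch the disk here; it only parses the path
        var file = new FileInfo(path);

        // FileInfo.Extension includes the leading dot
        if (file.Extension.Equals(".vd", StringComparison.OrdinalIgnoreCase))
            return parentTarget.Post(file);
        if (file.Extension.Equals(".vde", StringComparison.OrdinalIgnoreCase))
            return dependentTarget.Post(file);

        return false;
    }
}
```

With the joinBlock from the usage example, each incoming path could be passed as FileRouting.Route(path, joinBlock.Target1, joinBlock.Target2). Because the example predicate compares file names without their extension, a hypothetical scan01.vd and scan01.vde would be emitted as a pair regardless of which one arrives first. Note that a matched element is removed from its list, so this implementation pairs each parent with a single dependent. When no more files are expected, calling Complete on both Target1 and Target2 lets joinBlock.Completion finish once the buffered pairs have been consumed.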


Source: https://stackoverflow.com/questions/57985588/tpl-dataflow-feedback-loop
