TPL Dataflow block consumes all available memory

北恋 2020-12-20 16:38

I have a TransformManyBlock with the following design:

  • Input: Path to a file
  • Output: IEnumerable of the file's contents, one line at a time

3 Answers
  • 轻奢々 (OP)
    2020-12-20 17:26

    If the pipeline's output rate is lower than the posting rate, messages will accumulate in the pipeline until memory runs out or some queue limit is reached. If the messages are large, the process will soon be starved for memory.

    Setting BoundedCapacity to 1 causes a message to be rejected by the queue if the queue already holds one message. That is not the desired behavior in cases like batch processing, for example. Check this post for insights.
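
    To see the rejection behavior concretely, here is a minimal sketch (the block body and values are illustrative, not from the original post): with BoundedCapacity = 1, a second Post returns false while the first message is still inside the block.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks.Dataflow;

    // A deliberately slow consumer with room for only one message at a time.
    var block = new ActionBlock<int>(
        n => Thread.Sleep(100),
        new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

    Console.WriteLine(block.Post(1)); // True  - the block was empty
    Console.WriteLine(block.Post(2)); // False - the capacity of 1 is already used
    ```

    Post fails fast when the block is full; it never blocks or queues beyond BoundedCapacity, which is why unguarded Post loops silently drop messages on bounded blocks.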

    This working test illustrates my point:

    //Change BoundedCapacity to 1 to see it fail
    [TestMethod]
    public void stackOverflow()
    {
        var total = 1000;
        var processed = 0;
        var block = new ActionBlock<int>(
            messageUnit =>
            {
                Thread.Sleep(10);
                Trace.WriteLine($"{messageUnit}");
                processed++;
            },
            new ExecutionDataflowBlockOptions { BoundedCapacity = -1 } // -1 = unbounded
        );

        for (int i = 0; i < total; i++)
        {
            var result = block.SendAsync(i);
            Assert.IsTrue(result.IsCompleted, $"failed for {i}");
        }

        block.Complete();
        block.Completion.Wait();

        Assert.AreEqual(total, processed);
    }
    

    So my approach is to throttle the posting, so the pipeline does not accumulate many messages in its queues.

    Below is a simple way to do it. This way dataflow keeps processing messages at full speed, but messages do not accumulate, avoiding excessive memory consumption.

    //Should be adjusted for the specific use case.
    public void PostAsync(Message message)
    {
        while (block1.InputCount + ... + blockn.InputCount > 100)
        {
            Thread.Sleep(200);
            //Note: if each message allocates a large amount of memory, the
            //garbage collector may not keep up with the pace.
            //This is the perfect place to force the garbage collector to release memory.
        }
        block1.SendAsync(message);
    }
    
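    As an alternative to polling InputCount, TPL Dataflow has built-in backpressure: give each block a BoundedCapacity and await SendAsync, which completes only once the target block has room. The two-block pipeline below is a hypothetical sketch (the block bodies and capacity value are illustrative), not the original poster's pipeline.

    ```csharp
    using System;
    using System.Threading.Tasks;
    using System.Threading.Tasks.Dataflow;

    int processed = 0;
    var options = new ExecutionDataflowBlockOptions { BoundedCapacity = 100 };

    // A bounded two-stage pipeline: transform, then consume.
    var transform = new TransformBlock<int, int>(n => n * 2, options);
    var sink = new ActionBlock<int>(n => processed++, options);

    transform.LinkTo(sink, new DataflowLinkOptions { PropagateCompletion = true });

    for (int i = 0; i < 10_000; i++)
        await transform.SendAsync(i); // awaits for room instead of dropping the message

    transform.Complete();
    await sink.Completion;
    Console.WriteLine(processed); // prints 10000
    ```

    Because every block is bounded, memory use is capped at roughly the sum of the capacities, and the producer is slowed to the pipeline's pace without any Sleep loops.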
