TPL Dataflow Speedup?

难免孤独 2021-01-01 02:55

I wonder whether the following code can be optimized to execute faster. I currently seem to max out at around 1.4 million simple messages per second on a pretty simple data

4 Answers
  •  暗喜 (OP)
    2021-01-01 03:18

    If your workload is so granular that you expect to process millions of messages per second, then passing individual messages through the pipeline is no longer viable, because the per-message overhead of the blocks dominates the actual work. You'll need to chunk the workload by batching the messages into arrays or lists, so that each block invocation handles many messages at once. For example:

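    // TransformBlock needs explicit type arguments; the int[] input type is an assumption.
    var transform = new TransformBlock<int[], string[]>(batch =>
    {
        // Process the whole batch inside a single delegate invocation.
        var results = new string[batch.Length];
        for (int i = 0; i < batch.Length; i++)
        {
            results[i] = ProcessItem(batch[i]);
        }
        return results;
    });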

    To batch your input you could use a BatchBlock, the LINQ-style Buffer extension method from the System.Interactive package, the functionally similar Batch method from the MoreLinq package, or do the batching manually.
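
As a rough illustration of the BatchBlock option, the sketch below groups individual messages into arrays before they reach a batch-processing TransformBlock. The class name `BatchingSketch`, the method `RunAsync`, the `int` message type, and the `ProcessItem` body are all assumptions made for the example, not part of the original question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BatchingSketch
{
    // Hypothetical stand-in for the real per-message work.
    static string ProcessItem(int item) => item.ToString();

    public static async Task<List<string>> RunAsync(int messageCount, int batchSize)
    {
        var results = new List<string>();
        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };

        // Groups incoming messages into arrays of up to batchSize items.
        var batcher = new BatchBlock<int>(batchSize);

        // Handles one whole batch per delegate invocation.
        var transform = new TransformBlock<int[], string[]>(batch =>
        {
            var output = new string[batch.Length];
            for (int i = 0; i < batch.Length; i++)
            {
                output[i] = ProcessItem(batch[i]);
            }
            return output;
        });

        // An ActionBlock runs one delegate at a time by default,
        // so appending to the plain List here is safe.
        var collector = new ActionBlock<string[]>(batch => results.AddRange(batch));

        batcher.LinkTo(transform, linkOptions);
        transform.LinkTo(collector, linkOptions);

        for (int i = 0; i < messageCount; i++)
        {
            batcher.Post(i);
        }

        // Completing the BatchBlock flushes any final partial batch.
        batcher.Complete();
        await collector.Completion;
        return results;
    }
}
```

Calling `await BatchingSketch.RunAsync(1_000_000, 1000)` would push a million messages through the pipeline as a thousand arrays. The best batch size depends on how expensive `ProcessItem` is, so it is worth measuring a few values rather than assuming one.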
