tpl-dataflow

TPL Dataflow: design for parallelism while keeping order

Submitted by 亡梦爱人 on 2019-12-04 10:00:36
I have never worked with TPL before, so I was wondering whether this can be done with it. My application creates an animated GIF file from a lot of frames. I start with a list of Bitmap objects which represent the frames of the GIF file, and I need to do the following for each frame: paint a number of text/bitmap overlays onto the frame; crop the frame; resize the frame; reduce the image to 256 colors. Obviously this process can be done in parallel for all the frames in the list, but for each frame the order of the steps needs to be the same. After that, I need to write all the frames to the GIF file. Therefore all…
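A sketch of the kind of pipeline this describes, assuming hypothetical per-frame helpers (PaintOverlays, Crop, Resize, Quantize) and a gifWriter/frames that stand in for the asker's own objects. Each step is its own TransformBlock, so every frame passes through the steps in the same order while frames are processed in parallel; EnsureOrdered (true by default) keeps output order equal to input order, which matters when writing the GIF:

```csharp
using System;
using System.Drawing;
using System.Threading.Tasks.Dataflow;

// Not the asker's code: a chain of TransformBlocks, one per processing step.
var options = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount // frames in parallel
};

// PaintOverlays, Crop, Resize, Quantize are hypothetical helper methods.
var paint  = new TransformBlock<Bitmap, Bitmap>(f => PaintOverlays(f), options);
var crop   = new TransformBlock<Bitmap, Bitmap>(f => Crop(f), options);
var resize = new TransformBlock<Bitmap, Bitmap>(f => Resize(f), options);
var quant  = new TransformBlock<Bitmap, Bitmap>(f => Quantize(f, 256), options);

// Writing is sequential and, thanks to EnsureOrdered, in original frame order.
var write = new ActionBlock<Bitmap>(f => gifWriter.AddFrame(f));

var link = new DataflowLinkOptions { PropagateCompletion = true };
paint.LinkTo(crop, link);
crop.LinkTo(resize, link);
resize.LinkTo(quant, link);
quant.LinkTo(write, link);

foreach (var frame in frames) paint.Post(frame);
paint.Complete();
await write.Completion; // all frames written
```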

How can a TPL Dataflow block downstream get data produced by a source?

Submitted by 拥有回忆 on 2019-12-04 05:13:16
I'm processing images using TPL Dataflow. I receive a processing request, read an image from a stream, apply several transformations, then write the resulting image to another stream: Request -> Stream -> Image -> Image -> ... -> Stream. For that I use the blocks: BufferBlock<Request>, TransformBlock<Request,Stream>, TransformBlock<Stream,Image>, TransformBlock<Image,Image>, TransformBlock<Image,Image>, ..., writerBlock = new ActionBlock<Image>. The problem is that the initial Request contains the data necessary to create the resulting Stream, along with some additional info I need at that point. Do I…
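A common answer to this kind of question is to flow the originating Request alongside the payload through every stage, for example with a small envelope type. A sketch under that assumption (Request, Image, and the helpers OpenStream/Decode/Sharpen/WriteResult are stand-ins, not the asker's types):

```csharp
using System.Threading.Tasks.Dataflow;

// Sketch: carry the Request through the whole pipeline so the final block
// still has access to it when writing the result.
record Envelope<T>(Request Request, T Payload);

var read = new TransformBlock<Request, Envelope<Stream>>(
    req => new Envelope<Stream>(req, OpenStream(req)));

var decode = new TransformBlock<Envelope<Stream>, Envelope<Image>>(
    e => new Envelope<Image>(e.Request, Decode(e.Payload)));

var transform = new TransformBlock<Envelope<Image>, Envelope<Image>>(
    e => new Envelope<Image>(e.Request, Sharpen(e.Payload)));

var write = new ActionBlock<Envelope<Image>>(
    e => WriteResult(e.Request, e.Payload)); // Request is available here

var link = new DataflowLinkOptions { PropagateCompletion = true };
read.LinkTo(decode, link);
decode.LinkTo(transform, link);
transform.LinkTo(write, link);
```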

Dataflow: splitting work into small jobs and then grouping again

Submitted by 若如初见. on 2019-12-04 05:11:20
I need to do this kind of work: get a Page object from the database; for each page, get all its images and process them (I/O bound, for example, upload to a CDN); if all images were processed successfully, mark the Page as processed in the database. Since I need to control how many Pages I process in parallel, I decided to go with TPL Dataflow:

 ____________________________
|         Data pipe          |
|      BufferBlock<Page>     |
|    BoundedCapacity = 1     |
|____________________________|
              |
 ____________________________
|       Process images       |
| TransformBlock<Page, Page> |
|    BoundedCapacity = 1     |
| MaxDegreeOfParallelism = 8 |
|___________________…
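The split-then-group step can be sketched inside a single TransformBlock: each page's images are fanned out as tasks and awaited together before the page flows on, so the "group again" happens implicitly. Page, UploadToCdnAsync, and MarkProcessedAsync are hypothetical stand-ins for the asker's types:

```csharp
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Sketch: one block processes all of a page's images in parallel internally;
// the page only moves downstream once every image task has finished.
var processImages = new TransformBlock<Page, Page>(async page =>
{
    // Split: one task per image. Group: WhenAll before the page moves on.
    await Task.WhenAll(page.Images.Select(img => UploadToCdnAsync(img)));
    return page;
},
new ExecutionDataflowBlockOptions
{
    BoundedCapacity = 1,        // limit how many pages are in flight
    MaxDegreeOfParallelism = 8  // parallel image uploads across pages
});

var markProcessed = new ActionBlock<Page>(page => MarkProcessedAsync(page));

processImages.LinkTo(markProcessed,
    new DataflowLinkOptions { PropagateCompletion = true });
```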

TPL Dataflow: fixed buffer size without throwing items away

Submitted by ∥☆過路亽.° on 2019-12-04 03:57:30
After playing around with Dataflow I encountered a new problem. I would like to limit the input queue of all blocks. My producing block (an ActionBlock) creates 5000 elements really fast and posts them to a BroadcastBlock. So if I set the BoundedCapacity of the BroadcastBlock to 100, it throws a lot of data away. But I would prefer the producing block to wait for new slots in the input queue of my BufferBlock. Is there any way to get rid of this problem? That's exactly what BufferBlock is for. If you set its BoundedCapacity and it gets full, it will postpone receiving any messages until someone…
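The backpressure idea in the answer can be sketched like this: a bounded BufferBlock in front of the consumer, fed with SendAsync rather than Post. SendAsync awaits until the buffer has a free slot, so the producer is throttled instead of messages being dropped (the item type and delay are illustrative):

```csharp
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Sketch: BufferBlock with BoundedCapacity gives backpressure, not loss.
var buffer = new BufferBlock<int>(
    new DataflowBlockOptions { BoundedCapacity = 100 });

var consumer = new ActionBlock<int>(async item =>
{
    await Task.Delay(10); // simulate slow consumption
},
new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

buffer.LinkTo(consumer, new DataflowLinkOptions { PropagateCompletion = true });

for (int i = 0; i < 5000; i++)
    await buffer.SendAsync(i); // awaits when the buffer is full; nothing is dropped

buffer.Complete();
await consumer.Completion;
```

Note that Post, in contrast, returns false immediately when a bounded block is full, which is why the producer must use SendAsync to get waiting behavior.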

TPL Dataflow, guarantee completion only when ALL source data blocks completed

Submitted by 雨燕双飞 on 2019-12-03 18:57:52
Question: How can I rewrite the code so that it completes when BOTH transform blocks have completed? I thought completion means that a block is marked complete AND its output queue is empty?

public Test()
{
    broadCastBlock = new BroadcastBlock<int>(i => { return i; });
    transformBlock1 = new TransformBlock<int, string>(i =>
    {
        Console.WriteLine("1 input count: " + transformBlock1.InputCount);
        Thread.Sleep(50);
        return ("1_" + i);
    });
    transformBlock2 = new TransformBlock<int, string>(i =>
    {
        Console.WriteLine("2…
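A standard pattern for this situation (a sketch, not the asker's full code): link both transforms into the shared target without PropagateCompletion, and complete the target explicitly once both sources' Completion tasks have finished:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var broadcast = new BroadcastBlock<int>(i => i);
var t1 = new TransformBlock<int, string>(i => "1_" + i);
var t2 = new TransformBlock<int, string>(i => "2_" + i);
var sink = new ActionBlock<string>(s => Console.WriteLine(s));

var link = new DataflowLinkOptions { PropagateCompletion = true };
broadcast.LinkTo(t1, link);
broadcast.LinkTo(t2, link);

// Do NOT propagate completion from t1/t2 individually: the first one to
// finish would complete the sink while the other still has output pending.
t1.LinkTo(sink);
t2.LinkTo(sink);

_ = Task.WhenAll(t1.Completion, t2.Completion)
        .ContinueWith(_ => sink.Complete());

for (int i = 0; i < 10; i++) broadcast.Post(i);
broadcast.Complete();
await sink.Completion; // now completes only after BOTH transforms finished
```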

SingleProducerConstrained and MaxDegreeOfParallelism

Submitted by 给你一囗甜甜゛ on 2019-12-03 17:37:15
Question: In the C# TPL Dataflow library, SingleProducerConstrained is an optimisation option for ActionBlocks that you can use when only a single thread is feeding the action block: "If a block is only ever going to be used by a single producer at a time, meaning only one thread at a time will be using methods like Post, OfferMessage, and Complete on the block, this property may be set to true to inform the block that it need not apply extra synchronization." What if an ActionBlock is fed using a single…
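For reference, the option in question is set like this. Note that SingleProducerConstrained constrains who may call Post/Complete (one thread at a time), which is a separate question from how many threads the block uses internally via MaxDegreeOfParallelism; the documented optimization is tied to serial processing, so this sketch keeps MaxDegreeOfParallelism at 1:

```csharp
using System;
using System.Threading.Tasks.Dataflow;

// Sketch: valid only when a single thread ever calls Post/Complete.
var block = new ActionBlock<int>(
    n => Console.WriteLine(n),
    new ExecutionDataflowBlockOptions
    {
        SingleProducerConstrained = true, // one producer thread only
        MaxDegreeOfParallelism = 1        // the documented fast path assumes
                                          // serial processing
    });

for (int i = 0; i < 100; i++)
    block.Post(i); // all posts from this one thread

block.Complete();
await block.Completion;
```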

How to mark a TPL dataflow cycle to complete?

Submitted by 空扰寡人 on 2019-12-03 16:53:04
Given the following setup in TPL Dataflow:

var directory = new DirectoryInfo(@"C:\dev\kortforsyningen_dsm\tiles");
var dirBroadcast = new BroadcastBlock<DirectoryInfo>(dir => dir);
var dirfinder = new TransformManyBlock<DirectoryInfo, DirectoryInfo>(dir => dir.GetDirectories());
var tileFilder = new TransformManyBlock<DirectoryInfo, FileInfo>(dir => dir.GetFiles());
dirBroadcast.LinkTo(dirfinder);
dirBroadcast.LinkTo(tileFilder);
dirfinder.LinkTo(dirBroadcast);
var block = new XYZTileCombinerBlock<FileInfo>(3, (file) =>
{
    var coordinate = file.FullName.Split…
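A graph with a cycle cannot rely on PropagateCompletion alone, because a block in the cycle is never "done receiving" from the block it feeds. One common approach (a sketch, not from the question) is to count messages in flight and complete the cycle's entry block when the counter drains to zero:

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks.Dataflow;

// Sketch: track outstanding directories; when the counter hits zero the
// cycle has drained and it is safe to complete its entry block.
int pending = 1; // the root directory

TransformManyBlock<DirectoryInfo, DirectoryInfo> dirFinder = null;
dirFinder = new TransformManyBlock<DirectoryInfo, DirectoryInfo>(dir =>
{
    var subDirs = dir.GetDirectories();
    Interlocked.Add(ref pending, subDirs.Length); // count children first
    // ... process dir's files here ...
    if (Interlocked.Decrement(ref pending) == 0)
        dirFinder.Complete(); // nothing left in flight: end the cycle
    return subDirs;
});

dirFinder.LinkTo(dirFinder); // the cycle: subdirectories feed back in
dirFinder.Post(new DirectoryInfo(@"C:\tiles")); // illustrative path
await dirFinder.Completion;
```

The key invariant is that children are added to the counter before the current item is subtracted, so the count can only reach zero when the whole tree has been visited.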

TPL Dataflow vs plain Semaphore

Submitted by 强颜欢笑 on 2019-12-03 08:53:10
I have a requirement to build a scalable process. The process is mainly I/O operations with some minor CPU work (mainly deserializing strings). The process queries the database for a list of URLs, fetches data from these URLs, deserializes the downloaded data into objects, then persists some of the data into CRM Dynamics and also into another database. Afterwards I need to update the first database with which URLs were processed. Part of the requirement is to make the degree of parallelism configurable. Initially I thought to implement it via a sequence of tasks with await and limit the parallelism…
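The two alternatives being weighed can be sketched side by side: a plain SemaphoreSlim throttles the whole operation as one unit, while an ActionBlock gives a configurable MaxDegreeOfParallelism plus backpressure per stage. FetchAndPersistAsync is a hypothetical stand-in for the fetch/deserialize/persist work:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Variant A (sketch): plain semaphore throttling the whole operation.
async Task ProcessWithSemaphoreAsync(IEnumerable<string> urls, int parallelism)
{
    var gate = new SemaphoreSlim(parallelism);
    var tasks = urls.Select(async url =>
    {
        await gate.WaitAsync();
        try { await FetchAndPersistAsync(url); }
        finally { gate.Release(); }
    });
    await Task.WhenAll(tasks);
}

// Variant B (sketch): an ActionBlock with configurable parallelism.
async Task ProcessWithDataflowAsync(IEnumerable<string> urls, int parallelism)
{
    var block = new ActionBlock<string>(
        url => FetchAndPersistAsync(url),
        new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = parallelism,
            BoundedCapacity = parallelism * 2 // backpressure on the feeder
        });

    foreach (var url in urls) await block.SendAsync(url);
    block.Complete();
    await block.Completion;
}
```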

Benefits of using BufferBlock<T> in dataflow networks

Submitted by 梦想与她 on 2019-12-03 06:04:46
Question: I was wondering if there are benefits associated with using a BufferBlock linked to one or many ActionBlocks, other than throttling (using BoundedCapacity), instead of just posting directly to the ActionBlock(s) (as long as throttling is not required). Answer 1: If all you want to do is forward items from one block to several others, you don't need BufferBlock. But there are certainly cases where it is useful. For example, if you have a complex dataflow network, you might want to build it from…
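One such case can be sketched with DataflowBlock.Encapsulate: a BufferBlock serves as the single entry point of a larger network that callers never see internally (the blocks inside are illustrative):

```csharp
using System.Threading.Tasks.Dataflow;

// Sketch: hide a small network behind one IPropagatorBlock. The BufferBlock
// is the public entry point; the internal blocks stay private.
IPropagatorBlock<int, string> BuildPipeline()
{
    var input  = new BufferBlock<int>();
    var square = new TransformBlock<int, int>(n => n * n);
    var format = new TransformBlock<int, string>(n => $"result: {n}");

    var link = new DataflowLinkOptions { PropagateCompletion = true };
    input.LinkTo(square, link);
    square.LinkTo(format, link);

    return DataflowBlock.Encapsulate(input, format);
}

var pipeline = BuildPipeline();
pipeline.Post(4);
pipeline.Complete();
// Receiving from the pipeline yields "result: 16"
```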

Implementing correct completion of a retryable block

Submitted by 旧城冷巷雨未停 on 2019-12-03 04:42:38
Question: Teaser: guys, this question is not about how to implement a retry policy. It's about correctly completing a TPL Dataflow block. This question is mostly a continuation of my previous question, "Retry policy within ITargetBlock". The answer to that question was @svick's smart solution that utilizes a TransformBlock (source) and a TransformManyBlock (target). The only problem left is to complete this block in the right way: wait for all the retries to be completed first, and then complete the target…
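The completion idea itself (not svick's full retry block) can be sketched by tracking outstanding retry tasks and completing the target only after the source has finished AND the set of pending retries has drained, since retries may spawn further retries:

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Sketch: a set of outstanding retry tasks; each retry registers itself
// and removes itself when it finishes.
var pendingRetries = new ConcurrentDictionary<Task, byte>();

void TrackRetry(Task retry)
{
    pendingRetries.TryAdd(retry, 0);
    retry.ContinueWith(t => pendingRetries.TryRemove(t, out _));
}

async Task CompleteTargetAsync(IDataflowBlock source, IDataflowBlock target)
{
    await source.Completion;          // no new first attempts will arrive
    while (!pendingRetries.IsEmpty)   // retries can schedule more retries,
        await Task.WhenAll(pendingRetries.Keys); // so loop until drained
    target.Complete();                // only now is completion safe
}
```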