tpl-dataflow

Entity Framework and Parallelism

梦想与她 submitted on 2019-11-30 12:05:47
Background I have an application that receives periodic data dumps (XML files) and imports them into an existing database using Entity Framework 5 (Code First). The import happens via EF5 rather than, say, BULK INSERT or BCP because business rules that already exist in the entities must be applied. Processing seems to be CPU bound in the application itself (the extremely fast, write-cache enabled disk IO subsystem shows almost zero disk wait time throughout the process, and SQL Server shows no more than 8%-10% CPU time). To improve efficiency, I built a pipeline using TPL Dataflow with
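
A pipeline along those lines can be sketched as below. This is only an illustration of the shape such a pipeline might take, not the asker's actual code: the `Customer` type and the rule-applying delegate are stand-ins, and the EF `DbContext` work is simulated with a print so the sketch runs without a database.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Customer { public string Name = ""; }

class Program
{
    static async Task Main()
    {
        // CPU-bound stage runs in parallel; BoundedCapacity keeps memory in check
        // while the XML parser streams ahead of the database.
        var applyRules = new TransformBlock<string, Customer>(
            xml => new Customer { Name = xml.ToUpperInvariant() }, // stand-in for real business rules
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = Environment.ProcessorCount,
                BoundedCapacity = 1000
            });

        // Group entities so each SaveChanges call covers a whole batch.
        var batch = new BatchBlock<Customer>(500);

        var save = new ActionBlock<Customer[]>(entities =>
        {
            // With EF this would be: create a DbContext, Add each entity, SaveChanges().
            Console.WriteLine($"saved batch of {entities.Length}");
        });

        var link = new DataflowLinkOptions { PropagateCompletion = true };
        applyRules.LinkTo(batch, link);
        batch.LinkTo(save, link);

        foreach (var s in Enumerable.Range(0, 1200).Select(i => "cust" + i))
            await applyRules.SendAsync(s);
        applyRules.Complete();
        await save.Completion;
    }
}
```

Calling Complete on the first block and awaiting the last block's Completion drains the whole chain, because PropagateCompletion forwards the shutdown signal link by link.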

Retry policy within ITargetBlock<TInput>

本秂侑毒 submitted on 2019-11-30 07:10:04
I need to introduce a retry policy into the workflow. Let's say there are 3 blocks connected in this way: var executionOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 3 }; var buffer = new BufferBlock<int>(); var processing = new TransformBlock<int, int>(..., executionOptions); var send = new ActionBlock<int>(...); buffer.LinkTo(processing); processing.LinkTo(send); So there is a buffer which accumulates data, then sends it to the transform block that processes no more than 3 items at a time, and then the result is sent to the action block. Potentially during
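
One simple way to add retries, sketched here rather than prescribed, is to loop inside the transform delegate itself, so MaxDegreeOfParallelism still bounds concurrency and the surrounding block topology is unchanged. The maxRetries count, the backoff delay, and the simulated transient failure are all illustrative:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        var executionOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 3 };
        const int maxRetries = 3;  // illustrative policy
        int attempts = 0;

        var buffer = new BufferBlock<int>();
        var processing = new TransformBlock<int, int>(async n =>
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    attempts++;
                    if (attempt < 3) throw new InvalidOperationException("transient"); // simulated failure
                    return n * 2;
                }
                catch when (attempt < maxRetries)
                {
                    await Task.Delay(100 * attempt); // simple linear backoff between attempts
                }
            }
        }, executionOptions);
        var send = new ActionBlock<int>(n => Console.WriteLine(n));

        var link = new DataflowLinkOptions { PropagateCompletion = true };
        buffer.LinkTo(processing, link);
        processing.LinkTo(send, link);

        buffer.Post(21);
        buffer.Complete();
        await send.Completion;
        Console.WriteLine($"attempts: {attempts}");
    }
}
```

If the retry must instead go back through the buffer (e.g. so a retried item re-queues behind fresh items), the delegate would need to carry an attempt count with each message; the in-delegate loop above is the simpler variant.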

TPL Dataflow, how to forward items to only one specific target block among many linked target blocks?

自作多情 submitted on 2019-11-29 09:09:39
I am looking for a TPL Dataflow block solution which can hold more than a single item, which can link to multiple target blocks, but which has the ability to forward an item to only the specific target block that passes a filter/predicate. At no time should an item be delivered to multiple target blocks at the same time; it goes only to the one which matches the filter, or else the item can be discarded. I am not fond of BroadcastBlock because, if I understand correctly, it does not guarantee delivery (or does it?) and the filtering is done on the target block side, meaning BroadcastBlock essentially
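
The combination described is usually covered by a BufferBlock (which holds multiple items) together with the predicate overload of LinkTo: each item is offered to links in registration order and consumed by the first target whose predicate matches, so no item ever reaches two targets. A NullTarget link at the end discards anything nothing accepts. A minimal sketch with even/odd routing as the example predicate:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        var source = new BufferBlock<int>();
        var link = new DataflowLinkOptions { PropagateCompletion = true };

        var evens = new ActionBlock<int>(n => Console.WriteLine($"even: {n}"));
        var odds  = new ActionBlock<int>(n => Console.WriteLine($"odd: {n}"));

        // Links are tried in order; the first matching predicate consumes the item.
        source.LinkTo(evens, link, n => n % 2 == 0);
        source.LinkTo(odds,  link, n => n % 2 != 0);
        source.LinkTo(DataflowBlock.NullTarget<int>()); // discard anything unmatched

        for (int i = 0; i < 4; i++) source.Post(i);
        source.Complete();
        await Task.WhenAll(evens.Completion, odds.Completion);
    }
}
```

The NullTarget fallback matters: without it, an item that no predicate accepts would sit at the head of the BufferBlock and stall everything behind it.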

Hashed/Sharded ActionBlocks

房东的猫 submitted on 2019-11-29 01:13:42
I have a constant flow of certain items that I need to process in parallel, so I'm using TPL Dataflow. The catch is that items that share the same key (similar to a Dictionary) should be processed in FIFO order and not in parallel with each other (they can be parallel to items with different key values). The work being done is very CPU bound with minimal asynchronous locks, so my solution was to create an array of ActionBlock<T>s the size of Environment.ProcessorCount with no
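
That design can be sketched as an array of single-threaded ActionBlocks indexed by the key's hash: items with equal keys always land in the same block, preserving per-key FIFO order, while different keys spread across blocks and run in parallel. The string key and tuple payload here are placeholders for the application's own types:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class ShardedProcessor
{
    readonly ActionBlock<(string Key, int Value)>[] _shards;

    public ShardedProcessor(Action<(string Key, int Value)> work)
    {
        _shards = new ActionBlock<(string Key, int Value)>[Environment.ProcessorCount];
        for (int i = 0; i < _shards.Length; i++)
            // Default MaxDegreeOfParallelism is 1, so each shard is FIFO.
            _shards[i] = new ActionBlock<(string Key, int Value)>(work);
    }

    public bool Post((string Key, int Value) item) =>
        // Mask off the sign bit so the modulo index is never negative.
        _shards[(item.Key.GetHashCode() & 0x7fffffff) % _shards.Length].Post(item);

    public Task CompleteAsync()
    {
        foreach (var s in _shards) s.Complete();
        return Task.WhenAll(Array.ConvertAll(_shards, s => s.Completion));
    }
}
```

With this scheme, posting ("a", 1) and then ("a", 2) guarantees they run sequentially on the same shard, while ("b", 1) may run concurrently on another.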

TPL Complete vs Completion

半世苍凉 submitted on 2019-11-28 11:36:49
I need to import customer-related data from a legacy DB and perform several transformations during the process. This means a single entry needs to trigger additional "events" (synchronize products, create invoices, etc.). My initial solution was a simple parallel approach. It works okay, but sometimes it has issues. If the currently processed customers need to wait for the same type of events, their processing queues might get stuck and eventually time out, causing every underlying event to fail too (they depend on the one which failed). It doesn't happen all the time, yet it's annoying. So I
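
For context on the title's distinction: Complete and Completion are the two halves of dataflow shutdown. Complete() is a signal that no more input will arrive; Completion is a Task that finishes only after the block has drained everything already buffered. A small sketch:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        int processed = 0;
        var worker = new ActionBlock<int>(async n =>
        {
            await Task.Delay(50);      // simulate one "event" per entry
            processed++;               // safe: default MaxDegreeOfParallelism is 1
        });

        for (int i = 0; i < 5; i++) worker.Post(i);

        worker.Complete();             // signal: no further input will be posted
        await worker.Completion;       // wait: all 5 buffered items have been processed

        Console.WriteLine(processed);
    }
}
```

Awaiting Completion without first calling Complete() would wait forever, since the block keeps expecting more input; calling Complete() without awaiting Completion risks observing results before the buffered work has finished.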

TPL Dataflow, what's the functional difference between Post() and SendAsync()?

孤街醉人 submitted on 2019-11-27 19:58:30
I am confused about the difference between sending items through Post() or SendAsync(). My understanding is that in all cases, once an item reaches the input buffer of a dataflow block, control is returned to the calling context. Correct? Then why would I ever need SendAsync? And if my assumption is incorrect, I wonder, on the contrary, why anyone would ever use Post() if the whole idea of using data blocks is to establish a concurrent and async environment. I understand, of course, the technical difference: Post() returns a bool whereas SendAsync returns an awaitable Task of bool. But what
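
The difference only shows up when the input buffer can refuse an item, e.g. with BoundedCapacity set: Post returns false immediately when the block cannot accept the item (and the item is simply not delivered), while SendAsync returns a Task that completes with true once space frees up. A sketch; the delay and capacity values are arbitrary:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        // Capacity 1 and slow processing, so the buffer is full almost immediately.
        var slow = new ActionBlock<int>(async n => await Task.Delay(200),
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        bool first  = slow.Post(1);  // accepted: buffer had room
        bool second = slow.Post(2);  // rejected: item 1 still counts against capacity,
                                     // so this returns false and item 2 is dropped

        bool third = await slow.SendAsync(3); // waits asynchronously for room, then true

        Console.WriteLine($"{first} {second} {third}");
        slow.Complete();
        await slow.Completion;
    }
}
```

So in an unbounded setup the two behave almost identically, which is presumably why the difference is confusing; the moment backpressure exists, Post becomes a try-and-fail operation and SendAsync a wait-for-room operation.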

BroadcastBlock with Guaranteed Delivery in TPL Dataflow

拈花ヽ惹草 submitted on 2019-11-27 16:42:59
I have a stream of data that I process in several different ways... so I would like to send a copy of each message I get to multiple targets so that these targets may execute in parallel. However, I need to set BoundedCapacity on my blocks because the data is streamed in far faster than my targets can handle it, and there is a ton of data. Without BoundedCapacity I would quickly run out of memory. The problem, however, is that BroadcastBlock will drop messages if a target cannot handle them (due to
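
A common workaround, sketched here rather than a built-in block, is to replace BroadcastBlock with an ActionBlock that awaits SendAsync into every target: a full bounded target then exerts backpressure on the whole chain instead of causing drops. The helper name and demo numbers are illustrative:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class GuaranteedBroadcast
{
    // Hypothetical helper: delivers every item to every target, in order,
    // waiting (rather than dropping) when a bounded target is full.
    public static ITargetBlock<T> Create<T>(params ITargetBlock<T>[] targets) =>
        new ActionBlock<T>(
            async item =>
            {
                foreach (var t in targets)
                    await t.SendAsync(item);   // backpressure instead of message loss
            },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
}

class Program
{
    static async Task Main()
    {
        int a = 0, b = 0;
        var slowTarget = new ActionBlock<int>(async _ => { await Task.Delay(10); a++; },
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
        var fastTarget = new ActionBlock<int>(_ => b++,
            new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

        var broadcast = GuaranteedBroadcast.Create(slowTarget, fastTarget);
        for (int i = 0; i < 10; i++) await broadcast.SendAsync(i);
        broadcast.Complete();
        await broadcast.Completion;

        slowTarget.Complete(); fastTarget.Complete();
        await Task.WhenAll(slowTarget.Completion, fastTarget.Completion);
        Console.WriteLine($"{a} {b}");
    }
}
```

The trade-off relative to BroadcastBlock is that one slow target throttles delivery to all of them; that is exactly the behavior wanted here, since the alternative is silent message loss. Note completion must be forwarded to the targets manually, since a params array cannot use LinkTo's PropagateCompletion.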