According to the release notes for Dataflow 2.x, IntraBundleParallelization has been removed. Is there a way to control or increase the parallelism of DoFns on Dataflow 2.1.0?
It was removed because its implementation keeps a handle on the ProcessContext of a ProcessElement call after the call completes, and this is unsafe and not guaranteed to work.
However, I agree that it was a useful abstraction, and it is unfortunate that we don't have a replacement yet.
As a workaround, you can try the following:
- In @Setup, create an ExecutorService with the needed number of threads.
- In @StartBundle, create an ExecutorCompletionService wrapping the executor.
- In @ProcessElement, submit a task to it that processes the element and returns the result.
- Also in @ProcessElement, poll() the CompletionService for completed futures and output their results.
- In @FinishBundle, wait for all remaining futures to complete and output their results.
- In @Teardown, shut down the executor (the CompletionService itself has no shutdown method, and the executor created in @Setup is reused across bundles).

Remember not to use the ProcessContext in your futures. ProcessContext can only be used from the current thread and from within the current ProcessElement call.
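The steps above can be sketched roughly as follows. This is a plain-Java stand-in, not a real DoFn: the class name `ParallelBundleSketch`, the `work` function, and the `Consumer` used for output are all hypothetical, and each method is only named after the lifecycle annotation it would carry. In a real DoFn, output from @FinishBundle also needs an explicit timestamp and window, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-in for a DoFn<I, O>; no Beam classes are used here.
class ParallelBundleSketch<I, O> {
    private final int threads;
    private final Function<I, O> work; // assumed thread-safe per-element computation
    private ExecutorService executor;
    private ExecutorCompletionService<O> completion;
    private int inFlight; // tasks submitted in this bundle and not yet output

    ParallelBundleSketch(int threads, Function<I, O> work) {
        this.threads = threads;
        this.work = work;
    }

    // Would be @Setup: one thread pool per DoFn instance.
    void setup() {
        executor = Executors.newFixedThreadPool(threads);
    }

    // Would be @StartBundle: a fresh completion service per bundle.
    void startBundle() {
        completion = new ExecutorCompletionService<>(executor);
        inFlight = 0;
    }

    // Would be @ProcessElement. The task captures only the element value,
    // never the context; output happens on the calling thread.
    void processElement(I element, Consumer<O> output)
            throws InterruptedException, ExecutionException {
        completion.submit(() -> work.apply(element));
        inFlight++;
        Future<O> done;
        while ((done = completion.poll()) != null) { // non-blocking drain
            inFlight--;
            output.accept(done.get());
        }
    }

    // Would be @FinishBundle: block until every remaining task has finished.
    void finishBundle(Consumer<O> output)
            throws InterruptedException, ExecutionException {
        while (inFlight > 0) {
            Future<O> done = completion.take();
            inFlight--;
            output.accept(done.get());
        }
    }

    // Would be @Teardown: release the threads created in setup().
    void teardown() {
        executor.shutdown();
    }

    public static void main(String[] args) throws Exception {
        ParallelBundleSketch<Integer, Integer> fn =
                new ParallelBundleSketch<>(4, x -> x * x);
        List<Integer> out = new ArrayList<>();
        fn.setup();
        fn.startBundle();
        for (int i = 1; i <= 5; i++) {
            fn.processElement(i, out::add);
        }
        fn.finishBundle(out::add);
        fn.teardown();
        Collections.sort(out); // completion order is not deterministic
        System.out.println(out); // prints [1, 4, 9, 16, 25]
    }
}
```

Results are emitted in completion order rather than input order, so anything downstream should not rely on ordering within a bundle. Because nothing is output until a task has finished, an exception thrown by a task surfaces from `get()` on the calling thread and fails the bundle in the usual way.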