Batch PCollection in Beam/Dataflow
Question: I have a PCollection in GCP Dataflow/Apache Beam. Instead of processing its elements one by one, I need to combine them in batches of N, something like grouped(N). For bounded input, that would mean grouping the elements into batches of 10, with a final batch holding whatever is left over. Is this possible in Apache Beam?

Answer 1: Edit: this looks similar to the question "Google Dataflow "elementCountExact" aggregation". You should be able to do something similar by assigning the elements to the global window and using AfterPane.elementCountAtLeast(N) as the trigger. You still need to