We want to build a Cloud Dataflow Streaming pipeline which ingests events from Pubsub and performs multiple ETL-like operations on each individual event. One of these operat
Here are a few things you can do:
DoFn @Setup and @Teardown methods useful).
GroupByKey by the device id. Most of the time, at least with the Cloud Dataflow runner, the same key will then be processed by the same worker, although key assignments can change while the pipeline runs (usually not very often). You'll probably also want to set a windowing/triggering strategy with immediate triggering, so that grouped elements are emitted as they arrive instead of being held until the window closes.
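To illustrate the @Setup/@Teardown point, here is a minimal sketch in plain Java. It assumes the idea is to create an expensive client once per DoFn instance and reuse it for every element, rather than once per element. It deliberately does not use the Beam SDK so that it runs on its own: `FakeClient`, `EnrichFn`, `lookup` and `run` are hypothetical names, and the plain `setup`/`processElement`/`teardown` methods only stand in for methods that would carry Beam's @Setup, @ProcessElement and @Teardown annotations on a real DoFn.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the DoFn lifecycle; no Beam dependency, so it runs as-is.
// In a real pipeline these methods would be annotated on a DoFn subclass.
class ConnectionReuseSketch {

    // Hypothetical stand-in for an expensive client (REST, database, ...).
    static class FakeClient {
        static int opened = 0;   // counts how many clients were ever created
        boolean closed = false;

        FakeClient() { opened++; }

        // Hypothetical lookup: maps a device id to some other value.
        String lookup(String deviceId) { return "mapped-" + deviceId; }

        void close() { closed = true; }
    }

    // Hypothetical per-element transform that reuses one client per instance.
    static class EnrichFn {
        private FakeClient client;

        // Would be @Setup in Beam: runs once per instance, not per element.
        void setup() { client = new FakeClient(); }

        // Would be @ProcessElement in Beam: reuses the existing client.
        String processElement(String deviceId) { return client.lookup(deviceId); }

        // Would be @Teardown in Beam: releases the client.
        void teardown() { client.close(); }
    }

    // Drives one instance the way a runner might: setup, many elements, teardown.
    static List<String> run(List<String> deviceIds) {
        EnrichFn fn = new EnrichFn();
        fn.setup();
        List<String> out = new ArrayList<>();
        try {
            for (String id : deviceIds) {
                out.add(fn.processElement(id));
            }
        } finally {
            fn.teardown();
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> out = run(List.of("a", "b", "c"));
        System.out.println(out + " clients opened: " + FakeClient.opened);
    }
}
```

The point of the sketch is only the shape: the client is created outside the per-element path, so three elements cost one client. In an actual Beam pipeline the runner decides how many DoFn instances exist and when they are torn down, so treat teardown as cleanup rather than as a place for logic that must run.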