Stateful processing in Beam - is state shared across window panes?

Submitted by 老子叫甜甜 on 2019-12-04 16:36:41

Your triggering configuration does not affect how stateful processing of a ParDo proceeds. The elements are provided immediately to your DoFn without any buffering/triggering and your DoFn directly controls when output occurs.
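To make that concrete, here is a toy model in plain Python. It deliberately does not use the Beam API: `CountingStep` and its `emit_every` parameter are hypothetical names invented for illustration. It only sketches the idea that each element reaches the step as soon as it arrives and that the step itself decides when anything is output.

```python
from collections import defaultdict


class CountingStep:
    """Toy stand-in for a stateful DoFn (hypothetical, not the Beam API)."""

    def __init__(self, emit_every):
        self.emit_every = emit_every
        # One counter per key, standing in for a per-key state cell.
        self.counts = defaultdict(int)

    def process(self, key, value):
        # Called once per element, with no buffering in front of it.
        self.counts[key] += 1
        # The step, not a trigger, decides when output happens.
        if self.counts[key] % self.emit_every == 0:
            return [(key, self.counts[key])]
        return []


step = CountingStep(emit_every=3)
outputs = []
for element in [("a", 1), ("a", 2), ("b", 9), ("a", 3)]:
    outputs.extend(step.process(*element))
```

In this sketch only the third element for key "a" produces output; nothing is emitted for "b", because the step has chosen not to emit yet.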

The fact that you control the output is an important difference between stateful ParDo processing and Combine.perKey governed by triggers. This is why stateful ParDo is often a good choice when triggers are not rich enough for your use case.

I compare stateful ParDo processing with Combine + triggers in some more detail in my post on the Beam blog: https://beam.apache.org/blog/2017/02/13/stateful-processing.html

Now, if there is a GroupByKey or Combine.perKey somewhere upstream from your stateful ParDo, then each input element will be the output of some upstream trigger firing. But this does not affect how the state for your stateful ParDo is managed. State is persisted across elements, and a "pane" is just another element, so state is shared across panes and is maintained until the window fully expires.
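As a rough illustration of that scoping, the following plain-Python sketch models state as a dictionary keyed by (key, window). It is not Beam code: `WindowedState`, `expire`, and the window labels "w1" and "w2" are assumptions made up for this example. Two panes for the same key and window see the same state cell, while a different window gets its own.

```python
class WindowedState:
    """Toy model of state scoped to (key, window); not the Beam API."""

    def __init__(self):
        self.cells = {}

    def process(self, key, window, pane_value):
        # Each upstream pane arrives as an ordinary element.
        scope = (key, window)
        self.cells[scope] = self.cells.get(scope, 0) + pane_value
        return self.cells[scope]

    def expire(self, window):
        # Once a window has fully expired, its state is discarded.
        self.cells = {s: v for s, v in self.cells.items() if s[1] != window}


state = WindowedState()
early_pane = state.process("user1", "w1", 5)  # first firing for w1
late_pane = state.process("user1", "w1", 7)   # second firing, same state cell
other_window = state.process("user1", "w2", 1)
state.expire("w1")
```

Here the second pane for "w1" sees the running total left by the first, the element in "w2" starts from empty state, and expiring "w1" removes only that window's cell.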

Very nice summary leading up to your question, by the way!
