Is 'timely and stateful' processing possible with Apache Beam Java using the Dataflow runner?

Submitted by 丶灬走出姿态 on 2019-12-11 05:36:33

Question


I'm evaluating Apache Beam (Java SDK), targeting Google Cloud's Dataflow runner, for a somewhat complex state-machine workflow.

Specifically, I want to take advantage of stateful processing and timers, as explained in this blog post:

https://beam.apache.org/blog/2017/08/28/timely-processing.html

The capability matrix page says the following for Dataflow:

  • Timers: "Dataflow supports timers in non-merging windows". Ok that's fine.
  • Stateful processing:
    • "State is supported for non-merging windows". Ok fine.
    • "SetState and MapState are not yet supported." Hmm... That sounds like an issue. I'm unclear about what IS supported, though, and whether SetState and MapState are needed for the approach in the blog post.

So my question is: can I achieve the 'timely and stateful processing' approach explained in the blogpost on Dataflow? Are the required SDK features currently supported on Dataflow or perhaps coming soon?

Thanks in advance for any help

(The blogpost says to check the capability matrix which I've done... but as I'm just starting to evaluate Beam/Dataflow I'm unable to figure out if it's possible to do 'timely and stateful processing' using Dataflow as the runner.)

Source: https://stackoverflow.com/questions/49650594/timely-and-stateful-processing-possible-with-apache-beam-java-using-dataflow-r
