Forcing an empty pane/window in streaming in Apache Beam

人走茶凉 提交于 2019-12-04 02:03:41

问题


I am trying to implement a pipeline and takes in a stream of data and every minutes output a True if there is any element in the minute interval or False if there is none. The pane (with forever time trigger) or window (fixed window) does not seem to trigger if there is no element for the duration.

One workaround I am thinking is to put the stream into a global window, use a ValueState to keep a queue to accumulate the data and a timer as a trigger to exam the queue. I wonder if there is any neater way of achieving this.

Thanks.


回答1:


I think your timers and state solution is a good way to do this. However, keep in mind that your timers will not be set until you receive at least one element for a key.

If this is an issue, then the other thing you could do is inject a PCollection so that every window is guaranteed to have at least one dummy element. Then you can use ValueState to check if any element besides the dummy element has arrived. Or alternatively use Count.PerElement over the window and check if there is more than 1 element(An additional element, that is not the dummy element) for that window.




回答2:


I believe you can achieve this behaviour by setting

.withAllowedLateness(Duration.ZERO, Window.ClosingBehavior.FIRE_ALWAYS)

in your windowing step.



来源:https://stackoverflow.com/questions/45445096/forcing-an-empty-pane-window-in-streaming-in-apache-beam

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!