Apache Beam windowing: consider late data but emit only one pane

南旧 2020-12-21 09:18

I would like to emit a single pane when the watermark reaches x minutes past the end of the window. This lets me ensure I handle some late data, but still only emit one pa

2 answers
  • 2020-12-21 09:36

    First, I would set Window.ClosingBehavior to FIRE_ALWAYS. That way, when the window is permanently closed it emits a final pane (even if no late records have arrived since the previous pane) for which PaneInfo.isLast() returns true.

    Then I would go with the second option you describe:

    "I could emit the pane at the end of the window and then again when I receive late data, however in this case I am not emitting a single pane."

    and discard the non-final panes downstream, with something like:

    @ProcessElement
    public void processElement(ProcessContext c) {
        // Forward only the final pane of each window; drop earlier firings.
        if (c.pane().isLast()) {
            c.output(c.element());
        }
    }
    
  • 2020-12-21 09:39

    Thanks Guillem. In the end I used your answer to find this very useful link with lots of Apache Beam examples, and from it I came up with the following solution:

     // We first specify to never emit any panes
     .triggering(Never.ever())
    
     // We then specify to fire always when closing the window. This will emit a
     // single final pane at the end of allowedLateness
     .withAllowedLateness(allowedLateness, Window.ClosingBehavior.FIRE_ALWAYS)
     .discardingFiredPanes())
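
To make the effect of this configuration concrete, here is a minimal plain-Java sketch of which elements end up in the single final pane. It is a simplified model, not Beam code: the class `SinglePaneModel`, the `Arrival` type and the `finalPane` method are hypothetical names made up for illustration, all times are plain milliseconds, and it assumes an element is kept only if the watermark had not yet reached the window end plus allowedLateness when the element arrived. Real runners differ in details such as the exact boundary timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of "Never.ever() + FIRE_ALWAYS": nothing is emitted until the
// window expires, then one pane with every element that arrived in time.
// This is NOT the Beam API; all names here are hypothetical.
final class SinglePaneModel {

    /** One element: its value, event timestamp, and the watermark when it arrived (millis). */
    static final class Arrival {
        final String value;
        final long eventTime;
        final long watermarkAtArrival;

        Arrival(String value, long eventTime, long watermarkAtArrival) {
            this.value = value;
            this.eventTime = eventTime;
            this.watermarkAtArrival = watermarkAtArrival;
        }
    }

    /** Contents of the single pane for the window [windowStart, windowEnd). */
    static List<String> finalPane(long windowStart, long windowEnd,
                                  long allowedLateness, List<Arrival> arrivals) {
        // Assumption: the window expires once the watermark reaches this point.
        long expiry = windowEnd + allowedLateness;
        List<String> pane = new ArrayList<>();
        for (Arrival a : arrivals) {
            boolean inWindow = a.eventTime >= windowStart && a.eventTime < windowEnd;
            boolean arrivedBeforeExpiry = a.watermarkAtArrival < expiry;
            if (inWindow && arrivedBeforeExpiry) {
                pane.add(a.value); // on-time and late-but-allowed elements alike
            }
        }
        return pane;
    }
}
```

For a window [0, 100) with an allowed lateness of 60, an element with event time 50 that arrives when the watermark is at 130 is still included, while one that arrives when the watermark is at 170 is dropped.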
    