Empty output for Watermarked Aggregation Query in Append Mode

小蘑菇 2021-02-01 23:02

I use Spark 2.2.0-rc1.

I've got a Kafka topic on which I'm running a watermarked aggregation query, with a 1 minute watermark, giving out to

2 Answers
  •  渐次进展
    2021-02-02 00:06

    Here's my best guess:

    Append mode only outputs a result after the watermark has passed it (e.g. in this case 1 minute later). You didn't set a trigger (e.g. .trigger(Trigger.ProcessingTime("10 seconds"))), so by default it runs batches as fast as possible. So for the first minute all your batches should be empty, and the first batch after a minute should contain some content.
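
    To make that reasoning concrete, here is a rough plain-Python model of append mode with a watermark. It is not Spark code: the function name run_append_mode and its parameters are made up for illustration, and it assumes the watermark is the largest event time seen so far minus the delay, updated at the end of each batch and applied to the next one.

```python
from collections import defaultdict


def run_append_mode(batches, window_s=60, delay_s=60):
    """Toy model of append-mode output for a windowed count with a watermark.

    `batches` is a list of micro-batches, each a list of event times (seconds).
    Returns, for every batch, the list of (window_start, count) rows emitted.
    """
    counts = defaultdict(int)  # open windows: window_start -> running count
    max_event_time = None      # largest event time seen so far
    watermark = 0              # assumed to start at 0
    outputs = []
    for batch in batches:
        for t in batch:
            start = (t // window_s) * window_s
            if start + window_s <= watermark:
                continue  # too late: this window was already finalized
            counts[start] += 1
            if max_event_time is None or t > max_event_time:
                max_event_time = t
        # Append mode: emit only windows that the current watermark has passed.
        closed = sorted(s for s in counts if s + window_s <= watermark)
        outputs.append([(s, counts.pop(s)) for s in closed])
        # Advance the watermark for the next batch (it never moves backwards).
        if max_event_time is not None:
            watermark = max(watermark, max_event_time - delay_s)
    return outputs


if __name__ == "__main__":
    for i, rows in enumerate(run_append_mode([[5, 20], [50], [70], [130], [131]])):
        print(i, rows)
```

    In this model the early batches come back empty, and the window starting at 0 is only emitted once later events push the watermark past its end. Note that the watermark here moves with event time, so if no newer data arrives nothing is emitted, however much wall-clock time passes.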

    Another possibility is that you're using groupBy("time") instead of groupBy(window("time", "[window duration]")). I believe watermarks are meant to be used with time windows or mapGroupsWithState, so I'm not sure how the interaction works in this case.
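
    The difference between the two groupings can be sketched in plain Python. Again, this is only an illustration, not Spark's implementation: group_by_raw_time and group_by_window are hypothetical helpers, and the 60-second window is an assumed value. It says nothing about how Spark finalizes raw-timestamp groups under a watermark, which is the part I'm unsure about.

```python
from collections import Counter


def group_by_raw_time(event_times):
    """Count per distinct timestamp, analogous to grouping on the raw column."""
    return Counter(event_times)


def group_by_window(event_times, window_s=60):
    """Count per tumbling window, keyed by the window's start time."""
    return Counter((t // window_s) * window_s for t in event_times)


if __name__ == "__main__":
    events = [5, 20, 50, 70]  # event times in seconds
    print(dict(group_by_raw_time(events)))  # one group per distinct timestamp
    print(dict(group_by_window(events)))    # events bucketed into 60 s windows
```

    Grouping on the raw timestamp gives one group per distinct value, while the windowed version collapses nearby events into buckets that each have a well-defined end for the watermark to pass.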
