Multiple aggregations in Spark Structured Streaming

难免孤独 2020-12-09 04:09

I would like to do multiple aggregations in Spark Structured Streaming.

Something like this:

  • Read a stream of input files (from a folder)
  • Per
8 Answers
  •  一个人的身影
    2020-12-09 04:32

    Multiple aggregations in Spark Structured Streaming are not supported as of Spark 2.4. Supporting this can be tricky, especially with event time in "update" mode, since the aggregate output could change with late events. It is much more straightforward to support this in "append" mode; however, Spark does not support true watermarks yet.

    Here's a proposal to add it in "append" mode - https://github.com/apache/spark/pull/23576

    If interested you can watch the PR and post your votes there.
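
To make the "update" mode problem concrete, here is a rough plain-Python sketch. It is not Spark code: the timestamps, the 10-second window and the function name are all made up for illustration. Stage 1 counts events per event-time window, and stage 2 sums those counts. When late events arrive, stage 1 re-emits a window it had already reported, and a second aggregation that simply adds the incoming rows ends up double counting.

```python
from collections import Counter

WINDOW = 10  # seconds; hypothetical tumbling-window size


def window_counts(timestamps):
    """Stage 1: count events per tumbling event-time window."""
    counts = Counter()
    for ts in timestamps:
        counts[ts // WINDOW * WINDOW] += 1
    return dict(counts)


# Event timestamps (seconds) in two micro-batches (made-up data).
batch1 = [1, 2, 3, 11, 12]
batch2 = [21, 4, 5]  # 4 and 5 are late events for window [0, 10)

after_batch1 = window_counts(batch1)
after_batch2 = window_counts(batch1 + batch2)

# What stage 1 would emit in "update" mode: only windows whose count changed.
updates = {w: c for w, c in after_batch2.items() if after_batch1.get(w) != c}

# Stage 2 done naively: treat every emitted row as a new row and add it up.
naive_total = sum(after_batch1.values()) + sum(updates.values())

# Stage 2 done correctly: the old value for window 0 has to be retracted first.
correct_total = sum(after_batch2.values())

print("stage 1 after batch 1:", after_batch1)
print("stage 1 updates in batch 2:", updates)
print("naive total:", naive_total, "correct total:", correct_total)
```

In "append" mode a window would only be emitted once, after it can no longer change, so the second stage would never see a row that later needs to be corrected. That is why it is the easier mode to support.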
