Spark Streaming: Could not compute split, block not found

Asked by 感动是毒 · 2021-01-01 23:59 · 3 answers · 432 views

I am trying to use Spark Streaming with Kafka (Spark version 1.1.0), but the Spark job keeps crashing due to this error:

    14/11/21 12:39:23 ERROR TaskSetManager: Tas
3 Answers
  •  滥情空心 · 2021-01-02 00:24

    This is due to the Spark Streaming execution model: the receiver collects data for each batch interval and hands the resulting blocks to the Spark engine for processing. The Spark engine is not aware that the data comes from a streaming system, and it does not communicate its processing progress back to the streaming component.
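    For context, a minimal receiver-based setup on Spark 1.1.0 looks roughly like the sketch below. The ZooKeeper quorum, group id, and topic name are placeholders, and the explicit StorageLevel.MEMORY_AND_DISK_SER is an assumption on my part (chosen so blocks spill to disk rather than being evicted):

        import org.apache.spark.SparkConf
        import org.apache.spark.storage.StorageLevel
        import org.apache.spark.streaming.{Seconds, StreamingContext}
        import org.apache.spark.streaming.kafka.KafkaUtils

        object KafkaStreamExample {
          def main(args: Array[String]): Unit = {
            val conf = new SparkConf().setAppName("KafkaStreamExample")
            // The receiver buffers data for each 2-second batch, then hands
            // the blocks to the Spark engine as an RDD.
            val ssc = new StreamingContext(conf, Seconds(2))

            // Placeholder ZooKeeper quorum, consumer group, and topic map.
            val lines = KafkaUtils.createStream(
              ssc, "zkhost:2181", "my-group", Map("my-topic" -> 1),
              StorageLevel.MEMORY_AND_DISK_SER  // spill to disk instead of evicting
            ).map(_._2)  // keep message values only; keys are dropped

            lines.count().print()

            ssc.start()
            ssc.awaitTermination()
          }
        }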

    This means there is no flow control (backpressure), unlike in native streaming systems such as Storm or Flink, which can smoothly throttle the spout/source based on the processing rate. When ingestion outruns processing, the receiver's input blocks can be cleaned out of memory before the tasks that need them run, which is exactly when you see "Could not compute split, block ... not found".

    From https://spark.apache.org/docs/latest/streaming-programming-guide.html
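    Short of a custom receiver, a lighter mitigation (my suggestion, not from the guide above) is to cap the receiver's ingest rate so it cannot outrun processing. The rate value below is illustrative, and note that the automatic backpressure flag only exists from Spark 1.5 onward:

        import org.apache.spark.SparkConf

        val conf = new SparkConf()
          .setAppName("RateLimitedStream")
          // Cap each receiver at 10,000 records/sec (value is illustrative).
          .set("spark.streaming.receiver.maxRate", "10000")
          // Automatic backpressure, but only available in Spark 1.5+.
          .set("spark.streaming.backpressure.enabled", "true")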

    One option to work around this would be to manually pass processing info/acks back to the receiver component, which of course means writing a custom receiver. At that point we are starting to rebuild features that Storm/Flink and friends provide out of the box. A rough sketch of such a receiver follows below.
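    This sketch uses Spark's public Receiver API; ThrottledReceiver, fetchBatch, and the fixed sleep-based rate cap are all illustrative stand-ins, not a real feedback protocol:

        import org.apache.spark.storage.StorageLevel
        import org.apache.spark.streaming.receiver.Receiver

        // Illustrative custom receiver that crudely caps its own ingest rate.
        // A real implementation would replace the sleep with feedback/acks
        // from the processing side.
        class ThrottledReceiver(maxPerSecond: Int)
            extends Receiver[String](StorageLevel.MEMORY_AND_DISK_SER) {

          override def onStart(): Unit = {
            // Receive on a separate thread so onStart() returns immediately.
            new Thread("throttled-receiver") {
              override def run(): Unit = receive()
            }.start()
          }

          override def onStop(): Unit = {}  // the receive loop checks isStopped()

          private def receive(): Unit = {
            while (!isStopped()) {
              // fetchBatch() is a hypothetical pull from your actual source.
              fetchBatch(maxPerSecond).foreach(r => store(r))
              Thread.sleep(1000)  // crude rate cap standing in for real backpressure
            }
          }

          // Placeholder: pull up to n records from the external system.
          private def fetchBatch(n: Int): Seq[String] = Seq.empty
        }

    You would plug this in with ssc.receiverStream(new ThrottledReceiver(10000)), which returns an ordinary DStream[String].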
