How many RDDs does a DStream generate per batch interval?

礼貌的吻别 2021-02-19 20:39

Does one batch interval of data generate exactly one RDD in a DStream, regardless of how much data arrives in that interval?

3 answers
  •  小蘑菇 (original poster)
    2021-02-19 21:19

    Yes, a DStream produces exactly one RDD per batch interval, at every batch interval, regardless of the number of records that RDD contains (it may contain zero records).

    If that were not the case, and RDD creation depended on the number of elements, you would not have synchronous (micro-batch) streaming but rather a form of asynchronous processing.
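
    The one-batch-per-interval behaviour described above can be illustrated with a small plain-Python simulation. This is only a sketch and not Spark code: the function `micro_batches`, its parameters, and the sample arrival times are hypothetical names and values chosen for illustration, and each inner list merely stands in for the single RDD of one interval. In Spark itself, the function passed to `DStream.foreachRDD` is what receives each per-interval RDD.

```python
from typing import List, Tuple


def micro_batches(
    records: List[Tuple[float, str]],
    batch_interval: float,
    num_intervals: int,
) -> List[List[str]]:
    """Group (arrival_time, value) records into fixed-length intervals.

    Plain-Python stand-in for micro-batching, not Spark code: each inner
    list plays the role of the single RDD produced for one batch interval.
    """
    # One batch per interval is created up front, so an interval with no
    # records still yields an (empty) batch.
    batches: List[List[str]] = [[] for _ in range(num_intervals)]
    for arrival_time, value in records:
        index = int(arrival_time // batch_interval)
        if 0 <= index < num_intervals:
            batches[index].append(value)
    return batches


# Hypothetical arrival times in seconds; nothing arrives during [2, 4).
records = [(0.1, "a"), (0.5, "b"), (1.9, "c"), (4.2, "d")]
batches = micro_batches(records, batch_interval=2.0, num_intervals=3)

for i, batch in enumerate(batches):
    print(f"interval {i}: {len(batch)} record(s) -> {batch}")
```

    The number of batches is fixed by the number of intervals, not by the data: the middle interval receives no records and still produces a batch, just an empty one.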
