How to get the number of elements in a partition?

清歌不尽 2020-12-05 11:14

Is there any way to get the number of elements in a Spark RDD partition, given the partition ID, without scanning the entire partition?

Something like this:

3 answers
  •  借酒劲吻你
    2020-12-05 11:52

    The following returns a new RDD whose elements are the partition sizes, one element per partition:

    rdd.mapPartitions(iter => Iterator(iter.size), preservesPartitioning = true)
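
To look up the size for one specific partition ID, the same idea can be sketched with `mapPartitionsWithIndex`, which passes the partition index alongside the iterator. This is a minimal sketch, not a definitive implementation: the local `SparkSession`, the sample data, the object name `PartitionSizes`, and `targetId` are all assumptions made for illustration. Note that `iter.size` consumes the iterator, so each partition is still traversed once; this approach does not avoid the scan the question asks about.

```scala
import org.apache.spark.sql.SparkSession

object PartitionSizes {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session and toy data, used only for illustration.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("partition-sizes")
      .getOrCreate()
    val rdd = spark.sparkContext.parallelize(1 to 10, numSlices = 3)

    // Emit one (partitionId, elementCount) pair per partition.
    // iter.size walks the whole iterator, so every partition is scanned once.
    val sizes: Map[Int, Int] = rdd
      .mapPartitionsWithIndex(
        (idx, iter) => Iterator((idx, iter.size)),
        preservesPartitioning = true
      )
      .collect()
      .toMap

    // Hypothetical partition ID to look up.
    val targetId = 1
    println(s"partition $targetId -> ${sizes.get(targetId)}")

    spark.stop()
  }
}
```

Only one small pair per partition is collected to the driver, so the result stays cheap to hold even when the partitions themselves are large.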
    
