How to flatten a list inside an RDD?

你的背包 2020-12-30 05:07

Is it possible to flatten a list inside an RDD? For example, to convert:

 val xxx: org.apache.spark.rdd.RDD[List[Foo]]

to:

 val yyy:         


        
3 answers
  •  攒了一身酷
    2020-12-30 05:26

    You could enrich the RDD class with a .flatten method through an implicit class (the "pimp my library" pattern), so that it mirrors the List API:

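    import org.apache.spark.rdd.RDD
    import scala.reflect.ClassTag

    object SparkHelper {
      // Adds .flatten to RDD[Seq[T]]. RDD is invariant, so the static type must be RDD[Seq[T]].
      implicit class SeqRDDExtensions[T: ClassTag](val rdd: RDD[Seq[T]]) {
        def flatten: RDD[T] = rdd.flatMap(identity)
      }
    }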
    

    Once the implicit class is in scope, it can be used as follows:

    import SparkHelper._
    rdd.flatten
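
For the exact types in the question, calling flatMap(identity) directly may be enough, with no helper object needed. The sketch below assumes a local SparkSession and a hypothetical case class Foo standing in for the question's element type; the master URL, app name, and the flattenLists helper are illustrative names, not part of the original answer. Because RDD is invariant in its type parameter, an RDD[List[Foo]] is not an RDD[Seq[Foo]], so the direct call also avoids having to convert the element type first.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical element type standing in for the question's Foo.
case class Foo(id: Int)

object FlattenExample {
  // Flattens each inner list into its elements; empty lists contribute nothing.
  def flattenLists(xxx: RDD[List[Foo]]): RDD[Foo] = xxx.flatMap(identity)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-example")
      .getOrCreate()
    try {
      val xxx: RDD[List[Foo]] = spark.sparkContext.parallelize(
        Seq(List(Foo(1), Foo(2)), List.empty[Foo], List(Foo(3)))
      )
      val yyy: RDD[Foo] = flattenLists(xxx)
      println(yyy.collect().toList)
    } finally {
      spark.stop()
    }
  }
}
```

The same idea works for other collection element types (for example Array or Seq), since flatMap only needs a function that returns a traversable collection.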
    
