Spark: produce RDD[(X, X)] of all possible combinations from RDD[X]

刺人心 2020-11-30 03:59

Is it possible in Spark to implement the '.combinations' function from Scala collections?

   /** Iterates over combinations.
   *
   *  @return   An Iterator w         


        
4 Answers
  •  挽巷 (OP)
    2020-11-30 04:44

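    Spark RDDs support this natively through the cartesian transformation, which pairs every element with every element (so the result includes self-pairs and both orderings of each pair).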

    e.g.:

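    val rdd = sc.parallelize(1 to 5)
    // Pair every element of rdd with every element of rdd (ordered pairs)
    val cartesian = rdd.cartesian(rdd)
    // Bring the pairs back to the driver; only safe for small RDDs
    cartesian.collect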
    
    Array[(Int, Int)] = Array((1,1), (1,2), (1,3), (1,4), (1,5), 
    (2,1), (2,2), (2,3), (2,4), (2,5), 
    (3,1), (3,2), (3,3), (3,4), (3,5), 
    (4,1), (4,2), (4,3), (4,4), (4,5), 
    (5,1), (5,2), (5,3), (5,4), (5,5))
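
If what you need is closer to Scala's combinations(2), that is, unordered pairs without self-pairs rather than the full Cartesian product, one possible approach is to filter the cartesian output so that only pairs with a < b remain. The following is a minimal sketch, assuming the elements are distinct Ints; the names PairCombinations and unorderedPairs are hypothetical, and the local SparkContext is set up only for illustration (in spark-shell, sc already exists).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, for illustration only.
object PairCombinations {

  // Keep only pairs with a < b: this drops self-pairs such as (1,1) and
  // keeps one of each mirrored pair, e.g. (1,2) but not (2,1).
  // Assumes the elements of rdd are distinct.
  def unorderedPairs(rdd: RDD[Int]): RDD[(Int, Int)] =
    rdd.cartesian(rdd).filter { case (a, b) => a < b }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, just to make the sketch runnable on its own.
    val conf = new SparkConf().setAppName("pair-combinations").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val pairs = unorderedPairs(sc.parallelize(1 to 5)).collect().toSet

      // Plain Scala collections give the same pairs via combinations(2).
      val expected = (1 to 5).combinations(2).map(c => (c(0), c(1))).toSet

      assert(pairs == expected)
      println(pairs.size) // 10 pairs for 5 distinct elements
    } finally {
      sc.stop()
    }
  }
}
```

Note that cartesian still materializes all n * n pairs before the filter runs, so this gets expensive quickly for large RDDs.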
    
