Spark RDD basic transformation operations -- the randomSplit operation
12. The randomSplit operation

Split an RDD containing the numbers 1 to 10 into 3 RDDs using randomSplit. The weights Array(1, 4, 5) are relative, not absolute: Spark normalizes them, so the three resulting RDDs receive roughly 10%, 40%, and 50% of the elements. (randomSplit expects Array[Double]; the integer literals are widened to 1.0, 4.0, 5.0 by the expected type.)

scala> val rddData1 = sc.parallelize(1 to 10, 3)
rddData1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[28] at parallelize at <console>:24

scala> val splitRDD = rddData1.randomSplit(Array(1, 4, 5))
splitRDD: Array[org.apache.spark.rdd.RDD[Int]] = Array(MapPartitionsRDD[29] at randomSplit at <console>:26, MapPartitionsRDD[30] at randomSplit at <console>:26, MapPartitionsRDD[31] at randomSplit at <console>:26)

scala> splitRDD(0).collect
res9: Array[Int] = Array(3)

scala> splitRDD(1).collect
res10: Array[Int] = Array(2, 7)

Because the assignment is random, the actual sizes are only expected proportions: with just 10 elements they can deviate noticeably from the weights, as the output above shows. randomSplit also accepts an optional seed argument for reproducible splits, e.g. rddData1.randomSplit(Array(1, 4, 5), seed = 42).
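Conceptually, a weighted random split normalizes the weights into cumulative thresholds over [0, 1) and assigns each element to a bucket with one uniform draw, so every element lands in exactly one output. The following is a minimal Python sketch of that idea, not Spark's actual implementation; the function name random_split and its seed parameter are illustrative:

```python
import random

def random_split(data, weights, seed=None):
    """Sketch of a weighted random split: normalize the weights into
    cumulative upper bounds on [0, 1), then route each element to a
    bucket by a single uniform draw. Illustrative, not Spark's code."""
    rng = random.Random(seed)
    total = float(sum(weights))
    # Cumulative upper bounds, e.g. weights 1, 4, 5 -> 0.1, 0.5, 1.0
    bounds = []
    acc = 0.0
    for w in weights:
        acc += w / total
        bounds.append(acc)
    buckets = [[] for _ in weights]
    for x in data:
        r = rng.random()
        for i, bound in enumerate(bounds):
            if r < bound:
                buckets[i].append(x)
                break
    return buckets

# Split 1..10 with relative weights 1:4:5, mirroring the REPL example.
parts = random_split(range(1, 11), [1, 4, 5], seed=42)
print([len(p) for p in parts])
```

Unlike sampling each output independently, the cumulative-threshold scheme guarantees the buckets partition the input: every element appears in exactly one bucket, and the bucket sizes only approximate the weight ratios.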