I have an RDD with a tuple of values (String, SparseVector) and I want to create a DataFrame using the RDD. To get a (labe
This is an example in Scala for Spark 2.1:
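import org.apache.spark.ml.linalg.Vector
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Assumes `sparkSession` is a SparkSession val already in scope.
def featuresRDD2DataFrame(features: RDD[Vector]): DataFrame = {
  // Brings toDF() into scope for RDDs of tuples and case classes
  import sparkSession.implicits._
  // Pair each vector with a dummy label so the RDD holds tuples
  val rdd: RDD[(Double, Vector)] = features.map(x => (0.0, x))
  // Name the columns, then keep only the features column
  rdd.toDF("label", "features").select("features")
}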
The compiler did not recognize toDF() when called directly on the features RDD, which is why each vector is first wrapped in a tuple.
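For the case in the question, where the RDD already holds (String, vector) tuples, the same approach should work without the dummy label. Below is a minimal sketch, assuming a local SparkSession and made-up sample data. The object name, the helper toLabeledDF, the app name, and the sample values are all hypothetical, and the RDD is typed with the general Vector type rather than SparseVector.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object TupleRddToDataFrame {
  // Hypothetical helper: converts an RDD of (label, features) tuples to a DataFrame.
  def toLabeledDF(spark: SparkSession, rdd: RDD[(String, Vector)]): DataFrame = {
    // The implicits of this session provide toDF() on RDDs of tuples
    import spark.implicits._
    rdd.toDF("label", "features")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are placeholders
    val spark = SparkSession.builder()
      .appName("tuple-rdd-to-df")
      .master("local[*]")
      .getOrCreate()

    // Made-up sample data standing in for the RDD from the question
    val rdd: RDD[(String, Vector)] = spark.sparkContext.parallelize(Seq(
      ("a", Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))),
      ("b", Vectors.sparse(3, Array(1), Array(2.0)))
    ))

    val df = toLabeledDF(spark, rdd)
    df.printSchema()
    df.show(false) // false disables truncation of long cell values

    spark.stop()
  }
}
```

Passing the session in as a parameter keeps the implicits import local to the helper, so it does not depend on a session defined elsewhere.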