I have an RDD of org.apache.spark.mllib.linalg.Vector, where each vector holds three values [Int Int Int]. I am trying to convert it into a DataFrame using this code:

    import sqlContext.implicits._
    import org.apache.spark.sql.types.StructType
    import org.apache.spark.sql.types.StructField
    import org.apache.spark.sql.types.DataTypes
    import org.apache.spark.sql.types.ArrayData

The elements of vectrdd are of type org.apache.spark.mllib.linalg.Vector.

    val vectarr = vectrdd.toArray()
    case class RFM(Recency: Integer, Frequency: Integer, Monetary: Integer)
    val df = vectarr.map { case Array(p0, p1, p2) => RFM(p0, p1, p2) }.toDF()

I am getting the following error:

    warning: fruitless type test: a value of type org.apache.spark.mllib.linalg.Vector cannot also be a Array[T]
        val df = vectarr.map { case Array(p0, p1, p2) => RFM(p0, p1, p2) }.toDF()
    error: pattern type is incompatible with expected type;
     found   : Array[T]
     required: org.apache.spark.mllib.linalg.Vector
        val df = vectarr.map { case Array(p0, p1, p2) => RFM(p0, p1, p2) }.toDF()

The second method I tried is this:

    val vectarr = vectrdd.toArray().take(2)
    case class RFM(Recency: String, Frequency: String, Monetary: String)
    val df = vectrdd.map { case (t0, t1, t2) => RFM(p0, p1, p2) }.toDF()

I got this error:

    error: constructor cannot be instantiated to expected type;
     found   : (T1, T2, T3)
     required: org.apache.spark.mllib.linalg.Vector
        val df = vectrdd.map { case (t0, t1, t2) => RFM(p0, p1, p2) }.toDF()

I used this example as a guide: Convert RDD to Dataframe in Spark/Scala
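
Both errors say the same thing: each element handed to map is a Vector, so neither an Array pattern nor a tuple pattern can match it. For reference, here is a minimal sketch of what I understand the conversion should look like, calling toArray on each individual vector before pattern matching. It assumes Spark 2.x or later (SparkSession instead of my sqlContext), a local master, and made-up sample data. The object name VectorRddToDf and the helper toRfm are hypothetical names used only for illustration.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

// Defined at top level so Spark can derive an encoder for toDF().
case class RFM(Recency: Int, Frequency: Int, Monetary: Int)

object VectorRddToDf {
  // Hypothetical helper: an mllib Vector stores Doubles, so each value is
  // converted to Int explicitly. Throws a MatchError if the vector does
  // not have exactly three elements.
  def toRfm(v: Vector): RFM = v.toArray match {
    case Array(r, f, m) => RFM(r.toInt, f.toInt, m.toInt)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, Spark 2.x+ entry point.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("vector-rdd-to-df")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for vectrdd.
    val vectrdd = spark.sparkContext.parallelize(Seq(
      Vectors.dense(1.0, 2.0, 3.0),
      Vectors.dense(4.0, 5.0, 6.0)
    ))

    // Map each Vector (not the RDD as a whole) to a case class, then convert.
    val df = vectrdd.map(toRfm).toDF()
    df.show()

    spark.stop()
  }
}
```

The pattern match runs on the Array[Double] returned by each vector's toArray, which is why it type-checks here while matching the Vector itself against Array(...) or a tuple does not.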