Once I have a Row in Spark, whether it comes from a DataFrame or from Catalyst, I want to convert it to a case class in my code. This can be done by matching
someRow
scala> import spark.implicits._
scala> val df = Seq((1, "james"), (2, "tony")).toDF("id", "name")
df: org.apache.spark.sql.DataFrame = [id: int, name: string]
scala> case class Student(id: Int, name: String)
defined class Student
scala> df.as[Student].collectAsList
res6: java.util.List[Student] = [Student(1,james), Student(2,tony)]
Here the spark in spark.implicits._ is your SparkSession. If you are inside the REPL, the session is already defined as spark; otherwise, adjust the name to match your SparkSession.
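Outside the REPL, the same conversion might look roughly like the sketch below. The object name RowToCaseClass, the variable name session, the app name, and the local[*] master are all assumptions made for illustration. The point is that the import goes through whatever your SparkSession value is called, and that value has to be a val, because Scala only allows imports from a stable identifier. The case class is declared at the top level, outside the method, so that Spark can derive an encoder for it.

```scala
import org.apache.spark.sql.SparkSession

// Declared at the top level (not inside a method) so an Encoder can be derived.
case class Student(id: Int, name: String)

// Hypothetical entry point, for illustration only.
object RowToCaseClass {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; the name `session` is arbitrary.
    val session = SparkSession.builder()
      .appName("row-to-case-class")
      .master("local[*]")
      .getOrCreate()

    // The import is taken from the session value itself, not from a package.
    import session.implicits._

    val df = Seq((1, "james"), (2, "tony")).toDF("id", "name")

    // as[Student] turns the untyped rows into a typed Dataset[Student].
    val students: Array[Student] = df.as[Student].collect()
    students.foreach(println)

    session.stop()
  }
}
```

If the session were named spark, as in the REPL, the import line would read import spark.implicits._ exactly as in the transcript above.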