I have a Spark DataFrame with rows of Seq[(String, String, String)]. I'm trying to do some kind of flatMap with it, but anything I try ends up throwing exceptions.
import org.apache.spark.sql.{Row, SparkSession}

object ListSerdeTest extends App {
  implicit val spark: SparkSession = SparkSession
    .builder
    .master("local[2]")
    .getOrCreate()

  import spark.implicits._

  val myDS = spark.createDataset(
    Seq(
      MyCaseClass(mylist = Array(("asd", "aa"), ("dd", "ee")))
    )
  )

  myDS.toDF().printSchema()

  // each element of "mylist" is a nested struct, so it comes back as a Row
  myDS.toDF().foreach(
    row => {
      row.getSeq[Row](row.fieldIndex("mylist"))
        .foreach {
          case Row(a, b) => println(a, b)
        }
    }
  )
}

case class MyCaseClass(
  mylist: Seq[(String, String)]
)
The code above is yet another way to deal with the nested structure. Spark's default encoder encodes TupleX as a nested struct, which is why you are seeing this strange behaviour. And, as others said in the comments, you can't just do getAs[T](), since it is merely a cast (x.asInstanceOf[T]) and will therefore give you runtime exceptions.
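If you only need the tuples and not the Row objects, note that a typed flatMap on the Dataset itself (myDS.flatMap(_.mylist) in Spark) stays in case-class/tuple land and never exposes the nested struct as a Row. A minimal pure-Scala sketch of those flatMap semantics, using the same MyCaseClass (no SparkSession needed here, so it only illustrates the flattening, not the encoding):

```scala
// Pure-Scala sketch: what a typed Dataset.flatMap over mylist produces.
// In Spark this would be: myDS.flatMap(_.mylist) -> Dataset[(String, String)]
case class MyCaseClass(mylist: Seq[(String, String)])

val rows = Seq(MyCaseClass(mylist = Seq(("asd", "aa"), ("dd", "ee"))))

// flatMap keeps the elements as typed tuples, so no Row pattern matching
val pairs = rows.flatMap(_.mylist)

pairs.foreach(println)
```

The key point is that the typed API re-encodes through the tuple encoder on the way out, whereas toDF().foreach hands you the raw struct representation.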