Select case class based on String in Scala

Submitted by 佐手、 on 2020-12-06 13:23:34

Question


How can I select a case class based on a String value?

My code is

val spark = SparkSession.builder()...
val rddOfJsonStrings: RDD[String] = // some json strings as RDD
val classSelector: String = ??? // could be "Foo" or "Bar", or any other String value
case class Foo(foo: String)
case class Bar(bar: String)


if (classSelector == "Foo") {
  val df: DataFrame = spark.read.json(rddOfJsonStrings)
  df.as[Foo]
} else if (classSelector == "Bar") {
  val df: DataFrame = spark.read.json(rddOfJsonStrings)
  df.as[Bar]
} else {
  throw ClassUnknownException //custom Exception
}

The variable classSelector is a simple String that should point to the case class of the same name.

Imagine I have not only Foo and Bar as case classes but many more. How can I call the df.as[] statement based on the String (if that is possible at all)?

Or is there a completely different approach available in Scala?


Answer 1:


How is it possible to call the df.as[] statement based on the String (if possible at all)?

It isn't (or based on any runtime value). You may note that all answers still need to:

  1. have a separate branch for Foo and Bar (and one more branch for each class you'll want to add);

  2. repeat the class name twice in the branch.

You can avoid the second:

import scala.reflect.{classTag, ClassTag}

val df: DataFrame = spark.read.json(rddOfJsonStrings)
// local function defined where df and classSelector are visible
def dfAsOption[T : Encoder : ClassTag] =
  Option.when(classSelector == classTag[T].runtimeClass.getSimpleName)(df.as[T])

dfAsOption[Foo].orElse(dfAsOption[Bar]).getOrElse(throw ClassUnknownException)

But for the first you'd need a macro if it's possible at all. I would guess it isn't.
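The name-matching trick itself can be exercised without Spark. Below is a minimal sketch: the decodeAsOption helper and its string result are hypothetical stand-ins for df.as[T], and Option.when requires Scala 2.13+.

```scala
import scala.reflect.{classTag, ClassTag}

case class Foo(foo: String)
case class Bar(bar: String)

// Stand-in for df.as[T] so the dispatch can run without a SparkSession:
// it just reports which decode branch was selected.
def decodeAsOption[T: ClassTag](selector: String): Option[String] =
  Option.when(selector == classTag[T].runtimeClass.getSimpleName)(
    s"decoded as ${classTag[T].runtimeClass.getSimpleName}")

val selector = "Bar"
val decoded = decodeAsOption[Foo](selector)
  .orElse(decodeAsOption[Bar](selector))
  .getOrElse(throw new IllegalArgumentException(s"unknown class: $selector"))
```

Note that each additional class still needs its own .orElse(decodeAsOption[...]) link, which is the first point above.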




Answer 2:


Check the code below:

classSelector match {
    case c if Foo.getClass.getSimpleName.replace("$","").equalsIgnoreCase(c) =>  spark.read.json(rddOfJsonStrings).as[Foo]
    case c if Bar.getClass.getSimpleName.replace("$","").equalsIgnoreCase(c) =>  spark.read.json(rddOfJsonStrings).as[Bar]
    case _ => throw ClassUnknownException //custom Exception
}
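The replace("$","") above exists because Foo on its own refers to the companion object, whose runtime class carries a trailing $. A small check of that assumption:

```scala
case class Foo(foo: String)

// Foo alone names the companion object; its runtime class is "Foo$".
val companionName = Foo.getClass.getSimpleName
// classOf[Foo] names the case class itself: "Foo".
val caseClassName = classOf[Foo].getSimpleName
```

So stripping the "$" makes the companion's name line up with the selector string.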




Answer 3:


Define a generic method and invoke it:

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}
import scala.reflect.ClassTag

// A ClassTag is needed to recover the runtime Class[T] for the encoder.
def getDs[T](spark: SparkSession, rddOfJsonStrings: RDD[String])(implicit ct: ClassTag[T]): Dataset[T] =
  spark.read.json(rddOfJsonStrings).as[T](Encoders.bean(ct.runtimeClass.asInstanceOf[Class[T]]))

getDs[Foo](spark, rddOfJsonStrings)
getDs[Bar](spark, rddOfJsonStrings)



Answer 4:


An alternative.

Highlights:

  1. Use the simpleName of the case class, not of the companion object.
  2. If classSelector is null, the solution won't fail.

case class Foo(foo: String)
case class Bar(bar: String)

Test case:

val rddOfJsonStrings: RDD[String] = spark.sparkContext.parallelize(Seq("""{"foo":1}"""))
val classSelector: String = "Foo" // could be "Foo" or "Bar", or any other String value

val ds = classSelector match {
  case foo if classOf[Foo].getSimpleName == foo =>
    val df: DataFrame = spark.read.json(rddOfJsonStrings)
    df.as[Foo]
  case bar if classOf[Bar].getSimpleName == bar =>
    val df: DataFrame = spark.read.json(rddOfJsonStrings)
    df.as[Bar]
  case _ => throw new UnsupportedOperationException
}

ds.show(false)

    /**
      * +---+
      * |foo|
      * +---+
      * |1  |
      * +---+
      */
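To scale beyond two classes, the per-class branches from the answers above can be gathered into a lookup table keyed by class name. A minimal non-Spark sketch, where the readers map and its string results are hypothetical stand-ins for spark.read.json(...).as[T]:

```scala
case class Foo(foo: String)
case class Bar(bar: String)

// Each entry stands in for a reader that would build the typed Dataset;
// adding a class means adding one Map entry rather than a new branch.
val readers: Map[String, () => String] = Map(
  classOf[Foo].getSimpleName -> (() => "read as Foo"),
  classOf[Bar].getSimpleName -> (() => "read as Bar")
)

def readAs(selector: String): String =
  readers.getOrElse(
    selector,
    () => throw new IllegalArgumentException(s"unknown class: $selector")
  )()
```

The readers are thunks so that nothing is read until the selector has been matched.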


Source: https://stackoverflow.com/questions/62283312/select-case-class-based-on-string-in-scala
