Question
How can I select a case class based on a String value?
My code is
val spark = SparkSession.builder()...
val rddOfJsonStrings: RDD[String] = // some json strings as RDD
val classSelector: String = ??? // could be "Foo" or "Bar", or any other String value
case class Foo(foo: String)
case class Bar(bar: String)
if (classSelector == "Foo") {
val df: DataFrame = spark.read.json(rddOfJsonStrings)
df.as[Foo]
} else if (classSelector == "Bar") {
val df: DataFrame = spark.read.json(rddOfJsonStrings)
df.as[Bar]
} else {
throw ClassUnknownException //custom Exception
}
The variable classSelector is a simple String that should be used to point to the case class of the same name.
Imagine I don't only have Foo and Bar as case classes but more than those two. How is it possible to call the df.as[] statement based on the String (if possible at all)?
Or is there a completely different approach available in Scala?
Answer 1:
How is it possible to call the df.as[] statement based on the String (if possible at all)?
It isn't (or based on any runtime value). You may note that all answers still need to:
- have a separate branch for Foo and Bar (and one more branch for each class you'll want to add);
- repeat the class name twice in the branch.
You can avoid the second:
import scala.reflect.{classTag, ClassTag}

val df: DataFrame = spark.read.json(rddOfJsonStrings)

// local function defined where df and classSelector are visible
// (Option.when requires Scala 2.13+)
def dfAsOption[T : Encoder : ClassTag]: Option[Dataset[T]] =
  Option.when(classSelector == classTag[T].runtimeClass.getSimpleName)(df.as[T])

dfAsOption[Foo].orElse(dfAsOption[Bar]).getOrElse(throw ClassUnknownException)
But for the first you'd need a macro if it's possible at all. I would guess it isn't.
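To make that cost concrete, here is a minimal sketch of how the chain grows, assuming df, dfAsOption, classSelector and the required encoders (e.g. via import spark.implicits._) are in scope; Baz is a hypothetical third case class added only for illustration:

// Hypothetical third case class, used only to show the cost of adding a class
case class Baz(baz: String)

// Each supported class appears exactly once, but the chain still gains
// one orElse branch per class, which is the first point above
val ds: Dataset[_] =
  dfAsOption[Foo]
    .orElse[Dataset[_]](dfAsOption[Bar])
    .orElse[Dataset[_]](dfAsOption[Baz])
    .getOrElse(throw ClassUnknownException)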
Answer 2:
Check the code below:
classSelector match {
case c if Foo.getClass.getSimpleName.replace("$","").equalsIgnoreCase(c) => spark.read.json(rddOfJsonStrings).as[Foo]
case c if Bar.getClass.getSimpleName.replace("$","").equalsIgnoreCase(c) => spark.read.json(rddOfJsonStrings).as[Bar]
case _ => throw ClassUnknownException //custom Exception
}
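The replace("$", "") is needed because Foo in Foo.getClass refers to the auto-generated companion object, whose runtime class is named Foo$; the case class itself is reached via classOf[Foo]. A minimal, self-contained sketch of the difference:

case class Foo(foo: String)

// The companion object's runtime class carries a trailing "$"
println(Foo.getClass.getSimpleName)  // prints: Foo$
// The case class itself does not, so no replace is needed here
println(classOf[Foo].getSimpleName)  // prints: Foo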
Answer 3:
Define a generic method and invoke it:

getDs[Foo](spark, rddOfJsonStrings)
getDs[Bar](spark, rddOfJsonStrings)

import scala.reflect.ClassTag

// classOf[T] cannot be used for a type parameter, so a ClassTag supplies the runtime class
def getDs[T](spark: SparkSession, rddOfJsonStrings: RDD[String])(implicit ct: ClassTag[T]): Dataset[T] =
  spark.read.json(rddOfJsonStrings).as[T](Encoders.bean(ct.runtimeClass.asInstanceOf[Class[T]]))
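Note that Encoders.bean expects a Java-bean-style class (no-arg constructor, getters and setters), which plain case classes such as Foo and Bar don't provide, so the bean encoder may fail at runtime for them. A sketch of a variant that takes an Encoder context bound instead and relies on import spark.implicits._ at the call site (names reused from above purely for illustration):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

// Sketch: let the caller supply the implicit Encoder[T],
// derived for case classes via import spark.implicits._
def getDs[T: Encoder](spark: SparkSession, rddOfJsonStrings: RDD[String]): Dataset[T] =
  spark.read.json(rddOfJsonStrings).as[T]

// Usage, with Foo, Bar and spark.implicits._ in scope:
// getDs[Foo](spark, rddOfJsonStrings)
// getDs[Bar](spark, rddOfJsonStrings)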
Answer 4:
Alternative:

Highlights:

- Use the simpleName of the case class and not of the companion object
- If classSelector is null, the solution won't fail
case class Foo(foo: String)
case class Bar(bar: String)
Test case:
val rddOfJsonStrings: RDD[String] = spark.sparkContext.parallelize(Seq("""{"foo":1}"""))
val classSelector: String = "Foo" // could be "Foo" or "Bar", or any other String value
val ds = classSelector match {
case foo if classOf[Foo].getSimpleName == foo =>
val df: DataFrame = spark.read.json(rddOfJsonStrings)
df.as[Foo]
case bar if classOf[Bar].getSimpleName == bar =>
val df: DataFrame = spark.read.json(rddOfJsonStrings)
df.as[Bar]
case _ => throw new UnsupportedOperationException
}
ds.show(false)
/**
* +---+
* |foo|
* +---+
* |1 |
* +---+
*/
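Regarding the second highlight: the guards compare the simple names against classSelector with ==, so a null selector just falls through to the default case instead of throwing a NullPointerException. A small sketch wrapping the same match in a Try, assuming the same scope as the test case above:

// With a null selector both guards are false ("Foo" == null, "Bar" == null),
// so the match falls through to the default case
val nullSelector: String = null
val result = scala.util.Try {
  nullSelector match {
    case foo if classOf[Foo].getSimpleName == foo => spark.read.json(rddOfJsonStrings).as[Foo]
    case bar if classOf[Bar].getSimpleName == bar => spark.read.json(rddOfJsonStrings).as[Bar]
    case _ => throw new UnsupportedOperationException
  }
}
println(result)  // Failure(java.lang.UnsupportedOperationException)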
Source: https://stackoverflow.com/questions/62283312/select-case-class-based-on-string-in-scala