This is in Spark 1.6.x; I'm looking for a workaround.
I have a function that creates a DataFrame from a DataFrame's underlying RDD:
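The original snippet isn't reproduced here; judging from the answer below, it presumably constructed a fresh SQLContext inside the function, roughly like this sketch (the names and the spark-shell bindings sc / sqlContext are assumptions):

def rddAndBack(sc: SparkContext, df: DataFrame): DataFrame = {
  // Assumed reconstruction: creating a new SQLContext here is what breaks temp table lookup
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  sqlContext.createDataFrame(df.rdd, df.schema)
}

val df2 = rddAndBack(sc, df)
df2.registerTempTable("df2")
sqlContext.sql("SELECT * FROM df2") // fails with "Table not found: df2"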
This happens because you create a new SQLContext inside your function. Since temporary tables are limited in scope to their parent context, they cannot be accessed from another one. The table is still reachable through the context bound to df2:
df2.sqlContext.sql("SELECT * FROM df2")
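That works because df2.sqlContext is the very context that was created inside the function, which is where the temporary table was registered.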
To solve this, pass an existing SQLContext instead of a SparkContext:
def rddAndBack(sqlContext: org.apache.spark.sql.SQLContext, df: DataFrame): DataFrame = {
  // Reuse the caller's context, so temp tables registered later stay visible to it
  sqlContext.createDataFrame(df.rdd, df.schema)
}
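A call site might then look like this (the table name is illustrative):

val df2 = rddAndBack(df.sqlContext, df)
df2.registerTempTable("df2")
df.sqlContext.sql("SELECT * FROM df2") // same context, so the lookup succeeds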
or use the getOrCreate factory method:
def rddAndBack(sc: SparkContext, df: DataFrame): DataFrame = {
  // Returns the singleton SQLContext for this SparkContext, creating one only if none exists
  val sqlContext = org.apache.spark.sql.SQLContext.getOrCreate(sc)
  sqlContext.createDataFrame(df.rdd, df.schema)
}
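Because getOrCreate hands back the existing singleton rather than constructing a new context on each call, every caller sees the same temporary table registry.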
or use the SQLContext instance already bound to the input df:
def rddAndBack(sc: SparkContext, df: DataFrame): DataFrame = {
  // df.sqlContext is the context that created df; note that sc is unused in this variant
  val sqlContext = df.sqlContext
  sqlContext.createDataFrame(df.rdd, df.schema)
}
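With any of these variants the round trip stays inside a single context, so a registration-and-query sequence along these lines (table name again illustrative) should now succeed:

val df2 = rddAndBack(sc, df)
df2.registerTempTable("df2")
df2.sqlContext.sql("SELECT * FROM df2").show()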