I've been trying to find a reasonable way to test an application that uses SparkSession with the JUnit testing framework. While there seem to be good examples for SparkContext, I couldn't get a corresponding one working for SparkSession.
Since Spark 1.6, you can use SharedSparkContext or SharedSQLContext, which Spark uses for its own unit tests.
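For RDD-level code, SharedSparkContext (from the spark-core test-jar) mixes a ready-made SparkContext, available as sc, into your suite. A minimal sketch, assuming that test-jar is on the classpath; the word-count logic itself is just an illustration:

import org.apache.spark.{SharedSparkContext, SparkFunSuite}

class WordCountTest extends SparkFunSuite with SharedSparkContext {

  test("counts words") {
    // sc is provided and managed by SharedSparkContext
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
    assert(counts("a") === 2)
  }
}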
For DataFrame code, SharedSQLContext provides a ready-to-use sqlContext in the same way:

import org.apache.spark.sql.test.SharedSQLContext

class YourAppTest extends SharedSQLContext {

  var app: YourApp = _

  protected override def beforeAll(): Unit = {
    super.beforeAll()
    app = new YourApp
  }

  protected override def afterAll(): Unit = {
    super.afterAll()
  }

  test("Your test") {
    val df = sqlContext.read.json("examples/src/main/resources/people.json")
    app.run(df)
  }
}
Since Spark 2.3, SharedSparkSession is available:
import org.apache.spark.sql.test.SharedSparkSession

class YourAppTest extends SharedSparkSession {

  var app: YourApp = _

  protected override def beforeAll(): Unit = {
    super.beforeAll()
    app = new YourApp
  }

  protected override def afterAll(): Unit = {
    super.afterAll()
  }

  test("Your test") {
    val df = spark.read.json("examples/src/main/resources/people.json")
    app.run(df)
  }
}
UPDATE:
Maven dependency:
<dependency>
  <groupId>org.scalactic</groupId>
  <artifactId>scalactic</artifactId>
  <version>SCALATEST_VERSION</version>
</dependency>
<dependency>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest</artifactId>
  <version>SCALATEST_VERSION</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core</artifactId>
  <version>SPARK_VERSION</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql</artifactId>
  <version>SPARK_VERSION</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
SBT dependency:
"org.scalactic" %% "scalactic" % SCALATEST_VERSION
"org.scalatest" %% "scalatest" % SCALATEST_VERSION % "test"
"org.apache.spark" %% "spark-core" % SPARK_VERSION % Test classifier "tests"
"org.apache.spark" %% "spark-sql" % SPARK_VERSION % Test classifier "tests"
In addition, you can browse the test sources of Spark itself, which contain a huge set of test suites of every kind.
UPDATE 2:
Apache Spark Unit Testing Part 1 — Core Components
Apache Spark Unit Testing Part 2 — Spark SQL
Apache Spark Unit Testing Part 3 — Streaming
Apache Spark Integration Testing