How to execute SQL queries in Apache Spark

难免孤独 2021-01-03 09:39

I am very new to Apache Spark.
I have already set up Spark 2.0.2 on my local Windows machine and have worked through the "word count" example.

2 Answers
  • 2021-01-03 10:15

    To get this working, follow the steps below.

    In Spark 2.0.2 the entry point is SparkSession, which wraps a SparkContext instance as well as a SQLContext instance.

    The steps are:

    Step 1: Create a SparkSession.

    import org.apache.spark.sql.SparkSession
    val spark = SparkSession.builder().appName("MyApp").master("local[*]").getOrCreate()
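
    As a quick check of the point about SparkSession containing both contexts, the sketch below pulls them out of the session. It assumes the same local session as Step 1 and is meant to be pasted into spark-shell; sc and sqlContext are just local variable names chosen for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("MyApp").master("local[*]").getOrCreate()

    // The session owns one SparkContext (RDD API) and one SQLContext (the pre-2.0 SQL entry point).
    val sc = spark.sparkContext
    val sqlContext = spark.sqlContext

    // Both routes run the query through the same underlying session.
    spark.sql("SELECT 1 AS one").show()
    sqlContext.sql("SELECT 1 AS one").show()
    ```

    In new code there is usually no reason to go through sqlContext; it is kept mainly for compatibility with Spark 1.x programs.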
    

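    Step 2: Load the table from the database, in your case MySQL.

    // "dbtable" names the table to read from the database given in the URL.
    val loadedData = spark
          .read
          .format("jdbc")
          .option("url", "jdbc:mysql://localhost:3306/mydatabase")
          .option("driver", "com.mysql.jdbc.Driver")
          .option("dbtable", "mytable")
          .option("user", "root")
          .option("password", "toor")
          .load()
    // Register the DataFrame as a temporary view so it can be queried with spark.sql.
    loadedData.createOrReplaceTempView("mytable")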
    

    Step 3: Now you can run SQL queries against the view just as you would against a SQL database.

    val dataFrame = spark.sql("SELECT * FROM mytable")
    dataFrame.show()
    

    P.S. It is better to use the DataFrame API, or better still the Dataset API, but for those you will need to go through the documentation.

    Link to Documentation: https://spark.apache.org/docs/2.0.0/api/scala/index.html#org.apache.spark.sql.Dataset
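
    To give a rough idea of what the P.S. means, here is a minimal sketch that runs the same filter three ways: as a SQL string, through the DataFrame API, and through the typed Dataset API. It is written for spark-shell and uses a small in-memory Dataset in place of the MySQL table so that it runs without a database; the Person case class, its columns and the sample rows are assumptions made up for the example.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical row type; the real columns depend on your MySQL table.
    case class Person(id: Int, name: String, age: Int)

    val spark = SparkSession.builder().appName("MyApp").master("local[*]").getOrCreate()
    import spark.implicits._ // enables .toDS() and the $"column" syntax

    // In-memory stand-in for the table loaded over JDBC.
    val people = Seq(Person(1, "Ann", 34), Person(2, "Bob", 19), Person(3, "Cid", 52)).toDS()
    people.createOrReplaceTempView("mytable")

    // 1. SQL string: column names are only checked when the query is analysed at runtime.
    val viaSql = spark.sql("SELECT name FROM mytable WHERE age > 30")

    // 2. DataFrame API: the same query built from method calls on untyped rows.
    val viaDataFrame = people.toDF().filter($"age" > 30).select("name")

    // 3. Dataset API: the lambdas work on Person objects, so field names are checked by the compiler.
    val viaDataset = people.filter(_.age > 30).map(_.name)

    viaSql.show()
    viaDataFrame.show()
    viaDataset.show()
    ```

    All three go through the same query engine, so the choice is mostly about how much checking you want at compile time.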

  • 2021-01-03 10:15

    In Spark 2.x you no longer go through sqlContext; you use the spark session object instead, so you need to do:

    spark
      .read
      .format("jdbc")
      .option("url", "jdbc:mysql://localhost:3306/mydb")
      .option("driver", "com.mysql.jdbc.Driver")
      .option("dbtable", "mydb")
      .option("user", "root")
      .option("password", "")
      .load()
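
    The call above returns a DataFrame, which still has to be kept and registered before spark.sql can see it. One possible way to wrap that up is sketched below, with the connection settings taken from the snippet above. The helper names jdbcOptions and loadAsView are hypothetical, not part of Spark.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    // Connection settings from the snippet above; `table` is passed as the "dbtable" option.
    def jdbcOptions(table: String): Map[String, String] = Map(
      "url" -> "jdbc:mysql://localhost:3306/mydb",
      "driver" -> "com.mysql.jdbc.Driver",
      "dbtable" -> table,
      "user" -> "root",
      "password" -> ""
    )

    // Loads `table` over JDBC and registers it as a temporary view named `viewName`.
    // Needs a reachable MySQL server and the MySQL JDBC driver on the classpath.
    def loadAsView(spark: SparkSession, table: String, viewName: String): DataFrame = {
      val df = spark.read.format("jdbc").options(jdbcOptions(table)).load()
      df.createOrReplaceTempView(viewName)
      df
    }
    ```

    For example, loadAsView(spark, "mytable", "mytable") followed by spark.sql("SELECT * FROM mytable").show() would query the whole table. Spark's JDBC source also accepts anything that is valid in a FROM clause as dbtable, so passing "(SELECT id, name FROM mytable WHERE id > 100) AS t" would let MySQL do the filtering before Spark reads the rows; the table and column names in these usage lines are made up.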
    