Difference between SparkContext, JavaSparkContext, SQLContext, and SparkSession?

[愿得一人] 2020-12-04 10:19
  1. What is the difference between SparkContext, JavaSparkContext, SQLContext and SparkSession?
  2. Is there any m
4 answers
  •  失恋的感觉
    2020-12-04 11:07

    I will talk about Spark version 2.x only.

    SparkSession: The main entry point of a Spark application. It is the first object you create before running any code on Spark.

    from pyspark.sql import SparkSession
    spark = SparkSession.builder.master("local").appName("Word Count")\
    .config("spark.some.config.option", "some-value")\
    .getOrCreate()
    

    SparkContext: An inner object (property) of SparkSession. It is used to interact with the low-level API: through SparkContext you can create RDDs, accumulators, and broadcast variables.

    In most cases you won't need SparkContext directly. You can get it from the SparkSession:

    sc = spark.sparkContext  # Python, matching the session created above; in Scala: val sc = spark.sparkContext
    
