How to create SparkSession from existing SparkContext

时光取名叫无心 2020-12-30 20:47

I have a Spark application which uses the new Spark 2.0 API with SparkSession. I am building this application on top of another application which uses SparkContext. How can I create a SparkSession from the existing SparkContext?

6 Answers
  • 2020-12-30 21:08

    You would have noticed that we are using SparkSession and SparkContext, and this is not an error. Let's revisit the annals of Spark history for perspective. It is important to understand where we came from, as you will hear about these connection objects for some time to come.

    Prior to Spark 2.0.0, the three main connection objects were SparkContext, SQLContext, and HiveContext. The SparkContext object was the connection to a Spark execution environment and created RDDs and others, the SQLContext worked with Spark SQL on top of the SparkContext, and the HiveContext interacted with the Hive stores.

    Spark 2.0.0 introduced Datasets/DataFrames as the main distributed data abstraction interface and the SparkSession object as the entry point to a Spark execution environment. Appropriately, the SparkSession object is found in the namespace org.apache.spark.sql.SparkSession (Scala) or pyspark.sql.SparkSession (Python). A few points to note are as follows:

    In Scala and Java, Datasets form the main data abstraction as typed data; however, for Python and R (which do not have compile-time type checking), the data...

    https://www.packtpub.com/mapt/book/big_data_and_business_intelligence/9781785889271/4/ch04lvl1sec31/sparksession-versus-sparkcontext
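
    To make the excerpt concrete, here is a minimal Scala sketch (the app name is an arbitrary placeholder) of how a single SparkSession subsumes the pre-2.0 trio of connection objects:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("unified-entry-point") // placeholder name
      .enableHiveSupport()            // the role HiveContext used to play
      .getOrCreate()

    // The underlying SparkContext is still available when you need RDDs
    val sc = spark.sparkContext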

  • 2020-12-30 21:10

    As in the example above, you cannot instantiate SparkSession directly because its constructor is private. Instead, you can create an SQLContext from the SparkContext and then obtain the SparkSession from the SQLContext, like this:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sparkContext) // deprecated since Spark 2.0, but still works
    val spark = sqlContext.sparkSession
    

    Hope this helps

  • 2020-12-30 21:13

    Apparently there is no way to initialize a SparkSession from an existing SparkContext.

  • 2020-12-30 21:21

    Deriving a SparkSession from a SparkContext, or even from a SparkConf, is easy; you might just find the API slightly convoluted. Here's an example (I'm using Spark 2.4, but this should work in the older 2.x releases as well):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SparkSession

    // If you already have a SparkContext stored in `sc`
    val spark = SparkSession.builder.config(sc.getConf).getOrCreate()

    // Another example, building the SparkConf, SparkContext and SparkSession explicitly
    val conf = new SparkConf().setAppName("spark-test").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val spark = SparkSession.builder.config(sc.getConf).getOrCreate()
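
    A note on the design: getOrCreate reuses whatever session (and underlying SparkContext) is already registered in the JVM, so calling it again hands back the same objects. A minimal sketch of that behaviour, under the definitions above:

    val again = SparkSession.builder.getOrCreate()
    assert(again eq spark)           // the active session is reused, not rebuilt
    assert(spark.sparkContext eq sc) // and it wraps the existing SparkContext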
    

    Hope that helps!

  • 2020-12-30 21:22
    // Attach a new session to the existing SparkContext via its configuration
    val sparkSession = SparkSession.builder.config(sc.getConf).getOrCreate()
    
  • 2020-12-30 21:32
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;

    public JavaSparkContext getSparkContext() {
        SparkConf conf = new SparkConf()
                .setAppName("appName")
                .setMaster("local[*]");
        return new JavaSparkContext(conf);
    }

    public SparkSession getSparkSession() {
        // jsc.sc() unwraps the Scala SparkContext behind the Java wrapper
        return new SparkSession(getSparkContext().sc());
    }
    
    
    You can also try using the builder:
    
    public SparkSession getSparkSession() {
        SparkConf conf = new SparkConf()
                .setAppName("appName")
                .setMaster("local");

        return SparkSession
                .builder()
                .config(conf)
                .getOrCreate();
    }
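
    Of the two, the builder with getOrCreate is generally the safer choice: the direct new SparkSession(...) call happens to compile from Java only because Scala's private[sql] modifier is erased to public in the bytecode, and it always creates a fresh session, whereas getOrCreate reuses any session already active in the JVM.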
    