Spark - Error “A master URL must be set in your configuration” when submitting an app

爱一瞬间的悲伤 2020-12-02 07:31

I have a Spark app which runs with no problem in local mode, but has some problems when submitting to a Spark cluster.

The error message is as follows:

    org.apache.spark.SparkException: A master URL must be set in your configuration
16 Answers
  • 2020-12-02 07:51

    The TLDR:

    .config("spark.master", "local")
    

    a list of the options for spark.master in spark 2.2.1

    I ended up on this page after trying to run a simple Spark SQL Java program in local mode. To do this, I found that I could set spark.master using:

    SparkSession spark = SparkSession
        .builder()
        .appName("Java Spark SQL basic example")
        .config("spark.master", "local")
        .getOrCreate();
    

    An update to my answer:

    To be clear, this is not what you should do in a production environment. In a production environment, spark.master should be specified in one of a couple of other places: either in $SPARK_HOME/conf/spark-defaults.conf (this is where Cloudera Manager will put it), or on the command line when you submit the app (e.g. spark-submit --master yarn).
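
    For example (the class name and jar are placeholders, not from the question), the master could be supplied at submit time:

    spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar

    or set once in $SPARK_HOME/conf/spark-defaults.conf:

    spark.master    yarn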

    If you specify spark.master to be 'local' in this way, Spark will try to run in a single JVM, as indicated by the comments below. If you then try to specify --deploy-mode cluster, you will get the error 'Cluster deploy mode is not compatible with master "local"'. This is because setting spark.master=local means that you are NOT running in cluster mode.

    Instead, for a production app, within your main function (or in functions called by your main function), you should simply use:

    SparkSession spark = SparkSession
        .builder()
        .appName("Java Spark SQL basic example")
        .getOrCreate();
    

    This will use the configuration specified on the command line or in config files.

    Also, to be clear on this too: --master and "spark.master" are the exact same parameter, just specified in different ways. Setting spark.master in code, like in my answer above, will override attempts to set --master, and will override values in spark-defaults.conf, so don't do it in production. It's great for tests, though.

    Also, see this answer, which links to a list of the options for spark.master and what each one actually does.

    a list of the options for spark.master in spark 2.2.1

  • 2020-12-02 07:51

    Just add .setMaster("local") to your code, as shown below:

    val conf = new SparkConf().setAppName("Second").setMaster("local") 
    

    It worked for me! Happy coding!

  • 2020-12-02 07:54

    I had the same problem. Here is my code before modification:

    package com.asagaama
    
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkConf
    import org.apache.spark.rdd.RDD
    
    /**
      * Created by asagaama on 16/02/2017.
      */
    object Word {
    
      def countWords(sc: SparkContext) = {
        // Load our input data
        val input = sc.textFile("/Users/Documents/spark/testscase/test/test.txt")
        // Split it up into words
        val words = input.flatMap(line => line.split(" "))
        // Transform into pairs and count
        val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y }
        // Save the word count back out to a text file, causing evaluation.
        counts.saveAsTextFile("/Users/Documents/spark/testscase/test/result.txt")
      }
    
      def main(args: Array[String]) = {
        val conf = new SparkConf().setAppName("wordCount")
        val sc = new SparkContext(conf)
        countWords(sc)
      }
    
    }
    

    And after replacing:

    val conf = new SparkConf().setAppName("wordCount")
    

    With:

    val conf = new SparkConf().setAppName("wordCount").setMaster("local[*]")
    

    It worked fine!
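
    As a variation (a sketch, not part of the original fix): if you want to keep a local default for testing while still letting spark-submit control the master on a cluster, SparkConf.setIfMissing only applies the value when no master has already been supplied:

    val conf = new SparkConf()
      .setAppName("wordCount")
      .setIfMissing("spark.master", "local[*]") // used only when no master comes from spark-submit or spark-defaults.conf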

  • 2020-12-02 07:55

    We were missing the setMaster("local[*]") call. Once we added it, the problem was resolved.

    Problem:

    val spark = SparkSession
          .builder()
          .appName("Spark Hive Example")
          .config("spark.sql.warehouse.dir", warehouseLocation)
          .enableHiveSupport()
          .getOrCreate()
    

    Solution:

    val spark = SparkSession
          .builder()
          .appName("Spark Hive Example")
          .config("spark.sql.warehouse.dir", warehouseLocation)
          .enableHiveSupport()
          .master("local[*]")
          .getOrCreate()
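
    Note that .master("local[*]") on the builder sets the same spark.master property as the .config("spark.master", "local") call shown in an earlier answer, so for a production run you would typically drop it and pass --master to spark-submit instead.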
    
  • 2020-12-02 07:57

    Where is the SparkContext object defined? Is it inside the main function?

    I too faced the same problem; the mistake I made was that I initialized the SparkContext outside the main function, inside the class.

    When I initialized it inside the main function, it worked fine.
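
    For illustration, a minimal sketch of the two placements (the object names are hypothetical, not from this answer):

    import org.apache.spark.{SparkConf, SparkContext}

    // Problematic placement: the SparkContext is a field of the object,
    // so it is created during object initialization rather than inside main.
    object WordJobBefore {
      // no master is set here, so running this directly raises the error from the question
      val sc = new SparkContext(new SparkConf().setAppName("WordJobBefore"))

      def main(args: Array[String]): Unit = {
        println(sc.parallelize(1 to 10).count())
      }
    }

    // Working placement: the configuration and SparkContext are built inside main.
    object WordJobAfter {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("WordJobAfter").setMaster("local[*]")
        val sc = new SparkContext(conf)
        println(sc.parallelize(1 to 10).count())
      }
    }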

  • 2020-12-02 07:57

    If you are running a standalone application, you can also create the SparkContext directly instead of a SparkSession, and set the master on its SparkConf:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("Samples").setMaster("local")
    val sc = new SparkContext(conf)
    val textData = sc.textFile("sample.txt").cache()
    