How to make it easier to deploy my Jar to Spark Cluster in standalone mode?

时光取名叫无心 · 2020-12-31 14:53

I have a small cluster with 3 machines, and another machine for developing and testing. When developing, I set SparkContext to local. When everything works locally, I have to get my Jar onto the cluster to run the job there, which is cumbersome. How can I make it easier to deploy my Jar to the Spark cluster in standalone mode?

1 Answer
死守一世寂寞 · 2020-12-31 15:26

    In Spark, the program that creates the SparkContext is called 'the driver'. It is sufficient for the jar file containing your job to be available on the driver's local file system; the driver will pick it up and ship it to the master/workers.

    Concretely, your config will look like this:

    // Favor SparkConf over system properties to configure your SparkContext
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
                 .setMaster("spark://mymaster:7077")       // standalone master URL
                 .setAppName("SimpleApp")
                 .set("spark.local.ip", "172.17.0.1")      // address the workers can reach the driver on
                 .setJars(Seq("/local/dir/SimpleApp.jar")) // jar(s) to ship to the executors

    val sc = new SparkContext(conf)
    

    Under the hood, the driver starts a file server from which the workers download the jar file(s). It is therefore important (and a frequent source of problems) that the workers have network access back to the driver. This can often be ensured by setting 'spark.local.ip' on the driver to an address that is routable from the workers.
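    As a quick end-to-end check, a minimal driver program along these lines should run a trivial job on the cluster. The master URL, IP, and jar path below are the same hypothetical values as above; adjust them to your setup. If the count comes back, the workers were able to fetch the jar from the driver and connect back to it:

    import org.apache.spark.{SparkConf, SparkContext}

    object SimpleApp {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setMaster("spark://mymaster:7077")        // hypothetical master URL
          .setAppName("SimpleApp")
          .set("spark.local.ip", "172.17.0.1")       // hypothetical driver address
          .setJars(Seq("/local/dir/SimpleApp.jar"))  // hypothetical path to this jar
        val sc = new SparkContext(conf)

        // Trivial distributed job: if this returns, the workers fetched the
        // jar from the driver and could talk back to it.
        val count = sc.parallelize(1 to 1000).map(_ * 2).count()
        println(s"count = $count")

        sc.stop()
      }
    }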
