Spark Launcher waiting for job completion infinitely

执笔经年 2020-12-30 03:49

I am trying to submit a JAR containing a Spark job to a YARN cluster from Java code. I am using SparkLauncher to submit the SparkPi example:

Process spark = new SparkLauncher()
    .setSparkHome("C:\\spark-1.4.1-bin-hadoop2.6")
    .setAppResource("C:\\spark-1.4.1-bin-hadoop2.6\\lib\\spark-examples-1.4.1-hadoop2.6.0.jar")
    .setMainClass("org.apache.spark.examples.SparkPi")
    .setMaster("yarn-cluster")
    .launch();
spark.waitFor();


        
3 Answers
  • 2020-12-30 04:12

    Since this is an old post, I would like to add an update that might help whoever reads it later. Spark 1.6.0 added some functions to the SparkLauncher class, in particular:

    def startApplication(listeners: SparkAppHandle.Listener*): SparkAppHandle
    

    http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.launcher.SparkLauncher

    You can run the application without needing additional threads for stdout and stderr handling, plus there is nice status reporting of the running application. Use this code:

      // Needed for env.asJava below
      import scala.collection.JavaConverters._

      val env = Map(
        "HADOOP_CONF_DIR" -> hadoopConfDir,
        "YARN_CONF_DIR" -> yarnConfDir
      )

      val handle = new SparkLauncher(env.asJava)
        .setSparkHome(sparkHome)
        .setAppResource("Jar/location/.jar")
        .setMainClass("path.to.the.main.class")
        .setMaster("yarn-client")
        .setConf("spark.app.id", "AppID if you have one")
        .setConf("spark.driver.memory", "8g")
        .setConf("spark.akka.frameSize", "200")
        .setConf("spark.executor.memory", "2g")
        .setConf("spark.executor.instances", "32")
        .setConf("spark.executor.cores", "32")
        .setConf("spark.default.parallelism", "100")
        .setConf("spark.driver.allowMultipleContexts", "true")
        .setVerbose(true)
        .startApplication()

      println(handle.getAppId)
      println(handle.getState)
    

    You can keep querying the state of the Spark application until it reaches a final (successful) state; a minimal polling sketch is shown below. For information about how the Spark launcher server works in 1.6.0, see this link: https://github.com/apache/spark/blob/v1.6.0/launcher/src/main/java/org/apache/spark/launcher/LauncherServer.java
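
    For example, such a polling loop in Java (a sketch on my part, not from the original answer; the `launcher` parameter and the 5-second interval are assumptions) could look like this:

        import org.apache.spark.launcher.SparkAppHandle;
        import org.apache.spark.launcher.SparkLauncher;

        // Hypothetical helper: block until the launched application reaches a final state.
        static SparkAppHandle.State awaitFinalState(SparkLauncher launcher) throws Exception {
            SparkAppHandle handle = launcher.startApplication();
            // State.isFinal() is true for FINISHED, FAILED, KILLED and LOST.
            while (!handle.getState().isFinal()) {
                System.out.println("App " + handle.getAppId() + " state: " + handle.getState());
                Thread.sleep(5000); // arbitrary polling interval
            }
            return handle.getState();
        }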

  • 2020-12-30 04:17

    I got help on the Spark mailing list. The key is to read and drain getInputStream() and getErrorStream() of the Process. If the child process fills up its output buffer, it can deadlock - see the Oracle docs regarding Process. The streams should be read in separate threads:

    Process spark = new SparkLauncher()
        .setSparkHome("C:\\spark-1.4.1-bin-hadoop2.6")
        .setAppResource("C:\\spark-1.4.1-bin-hadoop2.6\\lib\\spark-examples-1.4.1-hadoop2.6.0.jar")
        .setMainClass("org.apache.spark.examples.SparkPi")
        .setMaster("yarn-cluster")
        .launch();
    
    InputStreamReaderRunnable inputStreamReaderRunnable = new InputStreamReaderRunnable(spark.getInputStream(), "input");
    Thread inputThread = new Thread(inputStreamReaderRunnable, "LogStreamReader input");
    inputThread.start();
    
    InputStreamReaderRunnable errorStreamReaderRunnable = new InputStreamReaderRunnable(spark.getErrorStream(), "error");
    Thread errorThread = new Thread(errorStreamReaderRunnable, "LogStreamReader error");
    errorThread.start();
    
    System.out.println("Waiting for finish...");
    int exitCode = spark.waitFor();
    System.out.println("Finished! Exit code:" + exitCode);
    

    where the InputStreamReaderRunnable class is:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;

    public class InputStreamReaderRunnable implements Runnable {

        private final BufferedReader reader;

        private final String name;

        public InputStreamReaderRunnable(InputStream is, String name) {
            this.reader = new BufferedReader(new InputStreamReader(is));
            this.name = name;
        }

        @Override
        public void run() {
            System.out.println("InputStream " + name + ":");
            try {
                // Drain the stream line by line so the child process cannot block
                // on a full output buffer.
                String line = reader.readLine();
                while (line != null) {
                    System.out.println(line);
                    line = reader.readLine();
                }
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    
  • 2020-12-30 04:29

    I implemented this using a CountDownLatch, and it works as expected. This is for SparkLauncher version 2.0.1, and it works in yarn-cluster mode too.

    ...
    final CountDownLatch countDownLatch = new CountDownLatch(1);
    SparkAppListener sparkAppListener = new SparkAppListener(countDownLatch);
    SparkAppHandle appHandle = sparkLauncher.startApplication(sparkAppListener);
    Thread sparkAppListenerThread = new Thread(sparkAppListener);
    sparkAppListenerThread.start();
    long timeout = 120;
    countDownLatch.await(timeout, TimeUnit.SECONDS);
    ...
    
    private static class SparkAppListener implements SparkAppHandle.Listener, Runnable {
        private static final Log log = LogFactory.getLog(SparkAppListener.class);
        private final CountDownLatch countDownLatch;
        public SparkAppListener(CountDownLatch countDownLatch) {
            this.countDownLatch = countDownLatch;
        }
        @Override
        public void stateChanged(SparkAppHandle handle) {
            String sparkAppId = handle.getAppId();
            State appState = handle.getState();
            if (sparkAppId != null) {
                log.info("Spark job with app id: " + sparkAppId + ",\t State changed to: " + appState + " - "
                        + SPARK_STATE_MSG.get(appState));
            } else {
                log.info("Spark job's state changed to: " + appState + " - " + SPARK_STATE_MSG.get(appState));
            }
            if (appState != null && appState.isFinal()) {
                countDownLatch.countDown();
            }
        }
        @Override
        public void infoChanged(SparkAppHandle handle) {}
        @Override
        public void run() {}
    }
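
    A small follow-up (my own sketch, not part of the original answer): await() returns a boolean that tells you whether the latch was released before the timeout, so you can distinguish a finished job from a timed-out wait; killing the application on timeout is just one possible choice.

        // Assumption: this runs where InterruptedException is handled or declared.
        boolean reachedFinalState = countDownLatch.await(timeout, TimeUnit.SECONDS);
        if (!reachedFinalState) {
            // The job did not reach a final state within the timeout.
            appHandle.kill();
        }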
    