How to submit a job via REST API?

Posted by 瘦欲@ on 2020-01-01 06:55:02

Question


I'm using DataStax Enterprise 4.8.3 and trying to implement a Quartz-based application that submits Spark jobs remotely. During my research I came across the following links:

  1. Apache Spark Hidden REST API
  2. Spark feature - Provide a stable application submission gateway in standalone cluster mode

To test this, I ran the command below (taken from link #1 above) directly in a shell on the master node (IP: "spark-master-ip") of my 2-node cluster:

curl -X POST http://spark-master-ip:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "myAppArgument1" ],
  "appResource" : "file:/home/local/sparkjob.jar",
  "clientSparkVersion" : "1.4.2",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.spark.job.Launcher",
  "sparkProperties" : {
    "spark.jars" : "file:/home/local/sparkjob.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled" : "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://spark-master-ip:6066"
  }
}'

But when I execute the command, I get an HTML response with the following text:

This Page Cannot Be Displayed
The system cannot communicate with the external server (spark-master-ip).
The Internet server may be busy, may be permanently down, or may be unreachable because of network problems.
Please check the spelling of the Internet address entered.
If it is correct, try this request later.

If you have questions, please contact your organization's network administrator and provide the codes shown below.

Date: Fri, 11 Dec 2015 13:19:15 GMT
Username: 
Source IP: spark-master-ip
URL: POST http://spark-master-ip/v1/submissions/create
Category: Uncategorized URLs
Reason: UNKNOWN
Notification: GATEWAY_TIMEOUT

Answer 1:


  • Check that you have started a Spark master and worker (using start-all.sh)

  • Check that the master's log file contains a message like:

INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066

  • Check that the started process is really listening on port 6066 (using netstat)

It should look like this:

root@x:~# netstat -apn | grep 11572 | grep LISTEN
tcp6       0      0 :::8080                 :::*                    LISTEN      11572/java      
tcp6       0      0 10.0.0.9:6066           :::*                    LISTEN      11572/java      
tcp6       0      0 10.0.0.9:7077           :::*                    LISTEN      11572/java      

Then replace "spark-master-ip" in the script with the IP address you see in the output of netstat (the example shows "10.0.0.9").
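
The log and netstat checks above can be sketched as a pair of small shell helpers. This is only an illustration: the function names rest_server_logged and rest_listen_address are made up for this example, and the path of the master log file has to be supplied by you, since it depends on how Spark was installed.

```shell
# Succeeds (exit status 0) if the given master log file contains the
# "Started REST server" message quoted above.
rest_server_logged() {
  grep -q 'Started REST server for submitting applications' "$1"
}

# Reads `netstat -apn`-style output on stdin and prints the local address
# bound to the given port (hypothetical helper; the port defaults to 6066).
rest_listen_address() {
  port="${1:-6066}"
  awk -v port="$port" '
    /LISTEN/ {
      # Column 4 is "Local Address", e.g. 10.0.0.9:6066 or :::8080.
      n = split($4, parts, ":")
      if (parts[n] == port) {
        # Drop the trailing ":<port>" and keep only the address part.
        print substr($4, 1, length($4) - length(port) - 1)
      }
    }'
}
```

With these functions loaded, running netstat -apn | rest_listen_address 6066 on the master would print the address to use in place of "spark-master-ip" (10.0.0.9 for the output shown above). If nothing is printed, no process is listening on that port.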




Answer 2:


Using Spark 2.4.3, we found that the REST API is disabled by default. When the REST API is disabled, calls to port 6066 will fail with the error you have shown.

We had to enable the REST API by adding the following entry to the spark-defaults.conf file:

spark.master.rest.enabled true

Once this entry was added, we restarted the Spark instance on the machine and the REST API came to life.
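
A minimal sketch of that change as a shell function, assuming the configuration file lives at the usual $SPARK_HOME/conf/spark-defaults.conf (your installation, for example a packaged distribution, may keep it elsewhere). The function name enable_rest_server is invented for this example; it only edits the file, so the master still has to be restarted afterwards.

```shell
# Ensure "spark.master.rest.enabled true" is set in the given properties file.
# Hypothetical helper: pass the path to your spark-defaults.conf as $1.
enable_rest_server() {
  conf="$1"
  key_re='^[[:space:]]*spark\.master\.rest\.enabled'
  touch "$conf"

  # Make sure the file ends with a newline before appending to it.
  if [ -n "$(tail -c 1 "$conf")" ]; then
    echo >> "$conf"
  fi

  if grep -Eq "${key_re}([[:space:]=:]|\$)" "$conf"; then
    # The key is already present: overwrite its value in place.
    # sed keeps a backup copy of the original file as "$conf.bak".
    sed -i.bak -E "s/${key_re}([[:space:]=:].*)?\$/spark.master.rest.enabled true/" "$conf"
  else
    # The key is absent: append the entry.
    printf '%s\n' 'spark.master.rest.enabled true' >> "$conf"
  fi
}
```

It could be called as enable_rest_server "$SPARK_HOME/conf/spark-defaults.conf". Running it twice leaves a single entry in the file, which makes it safe to use from a provisioning script.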



Source: https://stackoverflow.com/questions/34225879/how-to-submit-a-job-via-rest-api
