run spark as java web application

北慕城南 提交于 2019-11-30 15:30:57

Spark has REST apis to submit jobs by invoking spark master hostname.

Submit an Application:

curl -X POST http://spark-cluster-ip:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "myAppArgument1" ],
  "appResource" : "file:/myfilepath/spark-job-1.0.jar",
  "clientSparkVersion" : "1.5.0",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.mycompany.MyJob",
  "sparkProperties" : {
    "spark.jars" : "file:/myfilepath/spark-job-1.0.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled": "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://spark-cluster-ip:6066"
  }
}'

Submission Response:

{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20151008145126-0000",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true
}

Get the status of a submitted application

curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20151008145126-0000

Status Response

{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FINISHED",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true,
  "workerHostPort" : "192.168.3.153:46894",
  "workerId" : "worker-20151007093409-192.168.3.153-46894"
}

Now in the spark application which you submit should perform all the operations and save output to any datasource and access the data via thrift server as don't have much data to transfer(you can think of sqoop if you want to transfer data between your MVC app db and Hadoop cluster).

credits: link1, link2

Edit: (as per question in comment) build spark application jar with necessary dependencies and run the job in local mode. Write the jar in way to read the CSV and make use of MLib then store the prediction output in some data source to access it from web app.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!