I have lately been trying out Apache Spark. My question is specifically about triggering Spark jobs. Here I had posted a question on understanding Spark jobs. After getting dirty
I had this requirement too, and I was able to meet it using Livy Server, as one of the contributors, Josemy, mentioned. These are the steps I took; I hope they help somebody:
Download the Livy zip from https://livy.apache.org/download/
Follow the instructions: https://livy.apache.org/get-started/
Upload the zip to a client.
Unzip the file.
Check that the following two environment variables are set; if they don't exist, create them with the correct paths:
export SPARK_HOME=/opt/spark
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
Open port 8998 on the client.
Update $LIVY_HOME/conf/livy.conf with the master details and any other settings you need.
Note: Templates are available in $LIVY_HOME/conf
E.g. livy.file.local-dir-whitelist = /home/folder-where-the-jar-will-be-kept/
Run the server
$LIVY_HOME/bin/livy-server start
Stop the server
$LIVY_HOME/bin/livy-server stop
UI: <client-ip>:8998/ui/
Submitting a job: POST http://<your client ip goes here>:8998/batches
{
"className" : "<your class name goes here, including the package name>",
"file" : "your jar location",
"args" : ["arg1", "arg2", "arg3" ]
}
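The POST above can also be sent from a script. This is a minimal sketch in Python using only the standard library, not the only way to do it. The host, jar path and class name in the usage part are placeholders I made up. The terminal state names and the GET /batches/<id>/state polling call are how I understand Livy's REST API, so check them against your Livy version.

```python
import json
import time
import urllib.request

# Assumed terminal batch states; verify against your Livy version.
TERMINAL_STATES = {"success", "dead", "killed", "error"}


def build_batch_request(livy_url, jar_path, class_name, args):
    """Build the POST /batches request with the same JSON body as above."""
    payload = {"className": class_name, "file": jar_path, "args": list(args)}
    return urllib.request.Request(
        url=f"{livy_url}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(livy_url, jar_path, class_name, args):
    """Send the request and return Livy's JSON reply as a dict."""
    request = build_batch_request(livy_url, jar_path, class_name, args)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_batch(livy_url, batch_id, poll_seconds=5):
    """Poll the batch state until it reaches one of the assumed terminal states."""
    while True:
        with urllib.request.urlopen(f"{livy_url}/batches/{batch_id}/state") as response:
            state = json.load(response)["state"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)


if __name__ == "__main__":
    # Placeholder values: replace with your client IP, jar location and class.
    livy = "http://localhost:8998"
    batch = submit_batch(livy, "/home/folder-where-the-jar-will-be-kept/app.jar",
                         "com.example.Main", ["arg1", "arg2", "arg3"])
    print("final state:", wait_for_batch(livy, batch["id"]))
```

The jar path has to sit under the directory listed in livy.file.local-dir-whitelist if it is a local file.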
It turns out Spark has a hidden REST API to submit a job, check its status, and kill it.
Check out full example here: http://arturmkrtchyan.com/apache-spark-hidden-rest-api
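For a rough idea of what those three calls look like, here is a sketch in Python that only builds the requests. It assumes a standalone master with the REST server enabled on port 6066, and the JSON field names are as I remember them from the linked article. The API is undocumented, so treat every field here as an assumption and compare it with the article and your Spark version. The host, jar URL and class name are placeholders.

```python
import json
import urllib.request


def build_create_request(rest_url, jar_url, main_class, app_args, spark_version, master_url):
    """Build POST /v1/submissions/create (field names assumed from the linked article)."""
    payload = {
        "action": "CreateSubmissionRequest",
        "appResource": jar_url,
        "mainClass": main_class,
        "appArgs": list(app_args),
        "clientSparkVersion": spark_version,
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "sparkProperties": {
            "spark.app.name": main_class,
            "spark.jars": jar_url,
            "spark.master": master_url,
            "spark.submit.deployMode": "cluster",
        },
    }
    return urllib.request.Request(
        url=f"{rest_url}/v1/submissions/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(rest_url, submission_id):
    """Build GET /v1/submissions/status/<submission id>."""
    return urllib.request.Request(
        url=f"{rest_url}/v1/submissions/status/{submission_id}", method="GET")


def build_kill_request(rest_url, submission_id):
    """Build POST /v1/submissions/kill/<submission id>."""
    return urllib.request.Request(
        url=f"{rest_url}/v1/submissions/kill/{submission_id}", data=b"", method="POST")


if __name__ == "__main__":
    # Placeholder values; nothing is sent until urlopen() is called.
    request = build_create_request(
        "http://spark-master:6066", "file:/path/to/app.jar", "com.example.Main",
        ["arg1"], "2.4.0", "spark://spark-master:6066")
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```

The reply to the create call should contain a submission ID, which is what the status and kill requests take.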
Livy is an open source REST interface for interacting with Apache Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in Apache Hadoop YARN.
Here is a good client that you might find helpful: https://github.com/ywilkof/spark-jobs-rest-client
Edit: this answer was given in 2015. There are options like Livy available now.
Just use the Spark JobServer https://github.com/spark-jobserver/spark-jobserver
There are a lot of things to consider when building a service, and the Spark JobServer has most of them covered already. If you find things that aren't good enough, it should be easier to file a request or contribute code to their project than to reinvent it from scratch.