TL;DR: In a Spark Standalone cluster, what are the differences between client and cluster deploy modes? How do I set which mode my application is going to r
What are the practical differences between Spark Standalone client deploy mode and cluster deploy mode? What are the pro's and con's of using each one?
Let's try to look at the differences between client and cluster mode.
Client:
Cluster:
--supervise flag and be reset in case it dies.Which one is better? Not sure, that's actually for you to experiment and decide. This is no better decision here, you gain things from the former and latter, it's up to you to see which one works better for your use-case.
How to I choose which one my application is going to be running on, using
spark-submit
The way to choose which mode to run in is by using the --deploy-mode flag. From the Spark Configuration page:
/bin/spark-submit \
--class
--master \
--deploy-mode \
--conf = \
... # other options
\
[application-arguments]