Question
If I write a Spark program and run it in standalone (local) mode, do I have to change my code when I want to deploy it on a cluster, or is no change needed? Is Spark programming independent of the cluster it runs on?
Answer 1:
I don't think you need to make any changes. Your program should run the same way as it runs in local mode.
Yes, Spark programs are independent of the cluster, unless you are using something cluster-specific. Resource allocation is normally handled by the cluster manager, for example YARN.
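As a minimal sketch of that point (the application name, word-count logic, and input path are assumed for illustration), a program that does not hardcode the master can be run in local mode or submitted to a cluster without any code change:

```scala
import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // No .master(...) call here: the master is supplied at submit time,
    // so the same jar runs in local mode or on a YARN cluster unchanged.
    val spark = SparkSession.builder()
      .appName("WordCount")
      .getOrCreate()
    import spark.implicits._

    // args(0) is an input path; it can be a local file when testing
    // locally or an HDFS path when running on the cluster.
    val counts = spark.read.textFile(args(0))
      .flatMap(_.split("\\s+"))
      .groupByKey(identity)
      .count()

    counts.show()
    spark.stop()
  }
}
```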
Answer 2:
You just need to set the master option to yarn (or another resource manager) when you want to run it on a cluster.
If you want to run it locally, just use local[*], which uses a number of worker threads equal to the number of cores on your machine.
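As a small sketch (the application name is a placeholder), the master can also be set directly in code for quick local testing; note that a master set in code takes precedence over the --master flag passed to spark-submit, so you would remove or change this call when deploying to a cluster:

```scala
import org.apache.spark.sql.SparkSession

object LocalTest {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark in a single JVM with one worker thread per CPU core.
    // For a cluster deployment, drop this .master(...) call and pass
    // --master yarn (or another resource manager) to spark-submit instead,
    // because a master set in code overrides the submit-time flag.
    val spark = SparkSession.builder()
      .appName("LocalTest")
      .master("local[*]")
      .getOrCreate()

    println(spark.sparkContext.master)              // local[*]
    println(spark.sparkContext.defaultParallelism)  // roughly one per core

    spark.stop()
  }
}
```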
Source: https://stackoverflow.com/questions/54409149/spark-program-difference-in-local-mode-and-cluster