Spark SPARK_PUBLIC_DNS and SPARK_LOCAL_IP on stand-alone cluster with docker containers

Backend · Unresolved · 3 answers · 457 views
陌清茗 2020-12-09 06:09

So far I have run Spark only on Linux machines and VMs (bridged networking), but now I am interested in utilizing more computers as slaves. It would be handy to distribute a …

3 Answers
  •  小蘑菇 (OP)
    2020-12-09 06:21

    I'm running 3 different types of docker containers on my machine with the intention of deploying them into the cloud once all the software we need is added to them: Master, Worker and Jupyter notebook (with Scala, R and Python kernels).

    Here are my observations so far:

    Master:

    • I couldn't make it bind to the Docker host IP. Instead, I pass in a made-up domain name: -h "dockerhost-master" -e SPARK_MASTER_IP="dockerhost-master". I couldn't find a way to make Akka bind against the container's IP but accept messages against the host IP. I know it's possible with Akka 2.4, but maybe not with Spark.
    • I'm passing in -e SPARK_LOCAL_IP="${HOST_IP}", which causes the Web UI to bind against that address instead of the container's IP, but the Web UI works all right either way. A full docker run sketch follows this list.
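
    A minimal docker run sketch for the master, assuming a hypothetical image name (my-spark-master) and that HOST_IP holds the Docker host's address; 7077 and 8080 are the standalone master's default cluster and web UI ports:

        # HOST_IP is the Docker host's address (assumption); the image name is hypothetical
        docker run -d \
          -h "dockerhost-master" \
          -e SPARK_MASTER_IP="dockerhost-master" \
          -e SPARK_LOCAL_IP="${HOST_IP}" \
          -p 7077:7077 \
          -p 8080:8080 \
          my-spark-master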

    Worker:

    • I gave the worker container a different hostname and pass it via --host to Spark's org.apache.spark.deploy.worker.Worker class. It can't be the same as the master's or the Akka cluster will not work: -h "dockerhost-worker"
    • I'm using Docker's add-host so the container is able to resolve the hostname to the master's IP: --add-host dockerhost-master:${HOST_IP}
    • The master URL that needs to be passed is spark://dockerhost-master:7077 (see the sketch after this list)
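
    A matching sketch for a worker container under the same assumptions; the add-host entry lets it resolve dockerhost-master, and 8081 is the worker's default web UI port:

        # Hypothetical image name; the master URL is the one given above
        docker run -d \
          -h "dockerhost-worker" \
          --add-host dockerhost-master:${HOST_IP} \
          -p 8081:8081 \
          my-spark-worker spark://dockerhost-master:7077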

    Jupyter:

    • This one needs the master URL and the add-host entry to be able to resolve it
    • The SparkContext lives in the notebook, and that's where the web UI of the Spark application is started, not on the master. By default it binds to the internal IP address of the Docker container. To change that I had to pass in -e SPARK_PUBLIC_DNS="${VM_IP}" -p 4040:4040. Subsequent applications from the notebook would be on 4041, 4042, etc. (see the sketch after this list)
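
    And a sketch for the notebook container (hypothetical image name again; VM_IP is whatever address clients use to reach the machine, and 8888 is Jupyter's default port):

        # SPARK_PUBLIC_DNS makes the application UI advertise an externally reachable address
        docker run -d \
          --add-host dockerhost-master:${HOST_IP} \
          -e SPARK_PUBLIC_DNS="${VM_IP}" \
          -p 4040:4040 \
          -p 8888:8888 \
          my-spark-notebook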

    With these settings the three components are able to communicate with each other. At the moment I'm using custom startup scripts with spark-class to launch the classes in the foreground and keep the Docker containers from exiting (sketched below).
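
    As a rough illustration, the master's launcher could look like this (assuming SPARK_HOME is set inside the image; a worker's script would run org.apache.spark.deploy.worker.Worker with the master URL instead):

        #!/usr/bin/env bash
        # Launch the master in the foreground so the container keeps running;
        # exec replaces the shell so signals reach the JVM directly.
        exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.master.Master \
          --host "dockerhost-master" --port 7077 --webui-port 8080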

    There are a few other ports that might need to be exposed, such as the history server's, but I haven't run into them yet. Using --net host seems much simpler (sketched below).
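
    For comparison, with host networking the container shares the host's network stack, so none of the port mappings or made-up hostnames above are needed (image name still hypothetical):

        docker run -d --net host my-spark-master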
