Spark SPARK_PUBLIC_DNS and SPARK_LOCAL_IP on stand-alone cluster with docker containers

陌清茗 2020-12-09 06:09

So far I have run Spark only on Linux machines and VMs (bridged networking), but now I am interested in utilizing more computers as slaves. It would be handy to distribute a …

3 Answers
  •  生来不讨喜
    2020-12-09 06:33

    I think I found a solution for my use case (one Spark container per host OS):

    1. Use --net host with docker run => the host's eth0 is visible inside the container
    2. Set SPARK_PUBLIC_DNS and SPARK_LOCAL_IP to the host's IP, and ignore docker0's 172.x.x.x address

    Spark binds to the host's IP and other machines can communicate with it; port forwarding takes care of the rest. No DNS or other complex configuration was needed. I haven't tested this thoroughly, but so far so good.
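
    A minimal sketch of what such a launch could look like. The image name (my-spark-image), install path (/opt/spark) and IP (192.168.1.10) are placeholders for illustration, not from the original answer:

        # --net host shares the host's network stack, so Spark sees the host's eth0
        # rather than docker0; the two -e flags point Spark at the host's IP.
        # Image name, install path and IPs are placeholders -- adjust to your setup.
        docker run -d --net host \
          -e SPARK_PUBLIC_DNS=192.168.1.10 \
          -e SPARK_LOCAL_IP=192.168.1.10 \
          my-spark-image \
          /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
            spark://192.168.1.10:7077

    The same two variables could also be set in conf/spark-env.sh baked into the image; passing them with -e is just one convenient option when each host runs a single container.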

    Edit: Note that these instructions are for Spark 1.x; with Spark 2.x only SPARK_PUBLIC_DNS is required, and I think SPARK_LOCAL_IP is deprecated.
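
    If that caveat holds, a Spark 2.x launch would drop SPARK_LOCAL_IP; a sketch with the same placeholder image and IP as above:

        docker run -d --net host \
          -e SPARK_PUBLIC_DNS=192.168.1.10 \
          my-spark-image \
          /opt/spark/bin/spark-class org.apache.spark.deploy.worker.Worker \
            spark://192.168.1.10:7077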
