How to configure Apache Spark random worker ports for tight firewalls?

迷失自我 2020-12-08 23:35

I am using Apache Spark to run machine learning algorithms and other big data tasks. Previously, I was using Spark standalone cluster mode, running the Spark master and worker on …

2 Answers
  •  南笙 (OP)
     2020-12-09 00:18

    Update for Spark 2.x


    Some of the networking libraries have been rewritten from scratch, and many legacy *.port properties are now obsolete (cf. SPARK-10997 / SPARK-20605 / SPARK-12588 / SPARK-17678 / etc.).

    For Spark 2.1, for instance, the port ranges on which the driver will listen for executor traffic are

    • between spark.driver.port and spark.driver.port+spark.port.maxRetries
    • between spark.driver.blockManager.port and spark.driver.blockManager.port+spark.port.maxRetries

    And the port range on which the executors will listen for driver traffic and/or other executors' traffic is

    • between spark.blockManager.port and spark.blockManager.port+spark.port.maxRetries

    The "maxRetries" property allows for running several Spark jobs in parallel; if the base port is already used, then the new job will try the next one, etc, unless the whole range is already used.

    Sources:
       https://spark.apache.org/docs/2.1.1/configuration.html#networking
       https://spark.apache.org/docs/2.1.1/security.html under "Configuring ports"
