How to set port for pyspark jupyter notebook?

Submitted by 痴心易碎 on 2020-04-30 09:59:22

Question


I am starting a pyspark jupyter notebook with a script:

#!/bin/bash
ipaddress=...
echo "Start notebook server at IP address $ipaddress"

function snotebook ()
{
#Spark path (based on your computer)
SPARK_PATH=/home/.../software/spark-2.3.1-bin-hadoop2.7

export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

# For python 3 users, you have to add the line below or you will get an error
export PYSPARK_PYTHON=python3

$SPARK_PATH/bin/pyspark --master local[10]
}

snotebook --no-browser --ip $ipaddress --certfile=/home/.../local/mycert.pem --keyfile /home/.../local/mykey.key  

How can I set the port? Is there an environment variable that I can set? I would like to determine the port before the notebook starts. I tried passing --port 7999.


Answer 1:


If you mean the Spark UI ports, spark-env.sh lists these environment variables, which you can override or set in that file:

# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
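As a rough sketch of what such overrides could look like, assuming a standalone master and worker are actually in use: the port numbers below are arbitrary examples and do not come from the question. These variables configure the Spark daemons, not the Jupyter server.

```shell
# Sketch of lines that could be placed in conf/spark-env.sh.
# All port numbers are arbitrary illustrative choices.
export SPARK_MASTER_PORT=7078          # master service port (default 7077)
export SPARK_MASTER_WEBUI_PORT=8090    # master web UI port (default 8080)
export SPARK_WORKER_WEBUI_PORT=8091    # worker web UI port (default 8081)
```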

I'm not sure about the Jupyter equivalents, or whether PySpark even passes them through, but if jupyter notebook --port works on its own, then I would try

export PYSPARK_DRIVER_PYTHON_OPTS="notebook --port=7999"

If you want to pass all the arguments from snotebook into the variable, then you need

export PYSPARK_DRIVER_PYTHON_OPTS="notebook $@"
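Putting this together, a minimal sketch of the function with its arguments forwarded might look like the following. It assumes SPARK_PATH is set as in the question's script. The existence check around the pyspark call is an addition for illustration, and "$*" is used to join the arguments into a single string, so arguments that themselves contain spaces would lose their quoting.

```shell
#!/bin/bash
# Sketch: forward snotebook's arguments to Jupyter via PYSPARK_DRIVER_PYTHON_OPTS.

snotebook() {
    export PYSPARK_DRIVER_PYTHON="jupyter"
    # "$*" joins every argument of the function into one space-separated string
    export PYSPARK_DRIVER_PYTHON_OPTS="notebook $*"
    export PYSPARK_PYTHON=python3

    # SPARK_PATH is assumed to be set by the caller, as in the question's script.
    # The guard is an illustrative addition: launch only if pyspark is found there.
    if [ -x "${SPARK_PATH:-}/bin/pyspark" ]; then
        "$SPARK_PATH/bin/pyspark" --master "local[10]"
    else
        echo "pyspark not found under SPARK_PATH; options would be: $PYSPARK_DRIVER_PYTHON_OPTS" >&2
    fi
}
```

It could then be called as, for example, snotebook --no-browser --port=7999, so that the port is chosen before the notebook starts.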


Source: https://stackoverflow.com/questions/53754061/how-to-set-port-for-pyspark-jupyter-notebook
