How do I set up PySpark in Python 3 with spark-env.sh.template

Submitted by 这一生的挚爱 on 2020-01-01 07:13:11

Question


Because I have this issue in my IPython 3 notebook, I guess I have to change "spark-env.sh.template" somehow:

Exception: Python in worker has different version 2.7 than that in driver 3.4, PySpark cannot run with different minor versions
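
This error means the workers picked up a different interpreter than the driver. A minimal diagnostic sketch you can run from the same notebook (standard library only; the "defaults to python" note reflects PySpark's behavior when PYSPARK_PYTHON is unset):

import os
import sys

# Version of the interpreter the driver (this notebook) is running:
print("driver python: %d.%d" % sys.version_info[:2])

# Interpreter the workers are told to run; when unset, PySpark falls
# back to plain "python", which on many systems is Python 2.7:
print("PYSPARK_PYTHON:", os.environ.get("PYSPARK_PYTHON", "<unset, defaults to 'python'>"))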


Answer 1


From the Spark documentation at the time: "Spark does not yet work with Python 3. If you wish to use the Python API you will also need a Python interpreter (version 2.6 or newer)."

I had the same issue when running IPYTHON=1 ./pyspark.

OK, quick fix:

Edit the pyspark script (e.g. vim pyspark) and change the PYSPARK_DRIVER_PYTHON="ipython" line to

PYSPARK_DRIVER_PYTHON="ipython2"

That's it.

If you want to check where ipython points to, type which ipython in a terminal, and I bet it will be

/Library/Frameworks/Python.framework/Versions/3.4/bin/ipython
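
If you would rather check from Python, shutil.which (Python 3.3+) performs the same PATH lookup as the shell's which; the path in the comment is just an example:

import shutil

# Resolve "ipython" against PATH, like `which ipython` in the shell:
print(shutil.which("ipython"))
# e.g. /Library/Frameworks/Python.framework/Versions/3.4/bin/ipython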

**UPDATED**

The latest versions of Spark work well with Python 3, so the fix above may no longer be needed.

Just set the environment variable:

export PYSPARK_PYTHON=python3

In case you want this change to be permanent, add this line to the pyspark script.
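
If you launch Spark from a notebook rather than through the pyspark script, you can make the same choice from Python before the SparkContext is created. A minimal sketch, assuming a python3 binary is on the workers' PATH:

import os

# Must be set before the SparkContext exists; it tells the workers
# which interpreter to run. Adjust "python3" to a full path if needed.
os.environ["PYSPARK_PYTHON"] = "python3"

from pyspark import SparkContext

sc = SparkContext("local[*]", "version-check")  # the driver is this notebook's interpreter
print(sc.pythonVer)  # Python version the driver reports, e.g. "3.4"
sc.stop()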




Answer 2


I believe you can specify the two separately (PYSPARK_PYTHON selects the interpreter the workers run, PYSPARK_DRIVER_PYTHON the one the driver runs), like so:

PYSPARK_PYTHON=/opt/anaconda/bin/ipython
PYSPARK_DRIVER_PYTHON=/opt/anaconda/bin/ipython

Based on this other question: Apache Spark: How to use pyspark with Python 3.



Source: https://stackoverflow.com/questions/30940631/how-do-i-setup-pyspark-in-python-3-with-spark-env-sh-template
