Spark Installation and Configuration on macOS: ImportError: No module named pyspark


Question


I'm trying to configure apache-spark on macOS. All the online guides say either to download the Spark tarball and set some environment variables, or to run brew install apache-spark and then set some environment variables.

I installed apache-spark using brew install apache-spark. Running pyspark in the terminal gives me a Python prompt, which suggests the installation was successful.

But when I try to import pyspark in a Python file, I get the error ImportError: No module named pyspark.

What I can't understand is how pyspark is able to start a REPL while the same module cannot be imported from Python code.

I also tried pip install pyspark, but the module is still not recognized.
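One thing worth ruling out is that pip and the interpreter running the script are two different Python installations. A quick diagnostic (a minimal sketch) is to print which interpreter is running and where it searches for modules:

import sys

print(sys.executable)  # the interpreter actually running this script
print(sys.path)        # the directories it searches for modules

If pip installed pyspark into another interpreter's site-packages, it will not appear on this sys.path.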

In addition to installing apache-spark with Homebrew, I've set up the following environment variables:

# Point JAVA_HOME at the active JDK
if which java > /dev/null; then export JAVA_HOME=$(/usr/libexec/java_home); fi

# Point SPARK_HOME at the Homebrew install and run Spark with 2 local cores
if which pyspark > /dev/null; then
  export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec/"
  export PYSPARK_SUBMIT_ARGS="--master local[2]"
fi
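As far as I understand, SPARK_HOME alone does not make pyspark importable: Python resolves imports through sys.path (built from PYTHONPATH and site-packages), and none of the variables above touch it. From the interpreter I can see that nothing Spark-related is on the path (a minimal check):

import os, sys

print(os.environ.get('SPARK_HOME'))                    # set, thanks to the shell profile
print([p for p in sys.path if 'spark' in p.lower()])   # likely empty: Spark's python dir is never added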

Please suggest what exactly is missing from my setup to run pyspark code on my local machine.


Answer 1:


Sorry, I don't use a Mac, but on Linux there is another way besides the answer above:

# Link Spark's bundled pyspark package into Python's site-packages
sudo ln -s $SPARK_HOME/python/pyspark /usr/local/lib/python2.7/site-packages

Python will then find the module when it searches /path/to/your/python/site-packages.
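One caveat: pyspark itself imports py4j, which Spark ships as a zip under $SPARK_HOME/python/lib, so the symlink alone may not be enough. A quick way to confirm the link resolves (a minimal sketch):

import pyspark

# Should print a path inside site-packages (through the symlink);
# an ImportError for py4j here means the bundled zip still needs to be on PYTHONPATH
print(pyspark.__file__)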




Answer 2:


The pyspark module is not on your Python interpreter's search path.

Try this instead:

import os
import sys

# Tell Spark where its installation lives
os.environ['SPARK_HOME'] = "/usr/local/Cellar/apache-spark/2.1.0/libexec/"

# Put Spark's Python bindings and the bundled py4j on the module search path
sys.path.append("/usr/local/Cellar/apache-spark/2.1.0/libexec/python")
sys.path.append("/usr/local/Cellar/apache-spark/2.1.0/libexec/python/lib/py4j-0.10.4-src.zip")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
except ImportError as e:
    print("error importing spark modules", e)
    sys.exit(1)

# Run locally on all available cores, with "PySpark" as the application name
sc = SparkContext('local[*]', 'PySpark')
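If the imports succeed, a quick smoke test confirms the context works end to end (a minimal sketch; the numbers are only an illustration):

rdd = sc.parallelize(range(100))
print(rdd.sum())  # 4950 if everything is wired up correctly
sc.stop()         # release the context when done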

If you don't want to do that in code, add the paths to your environment instead. And don't forget to include the Python path:

# Spark installation root
export SPARK_HOME=/usr/local/Cellar/apache-spark/2.1.0/libexec/
# Make pyspark and the bundled py4j importable
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
export PATH=$SPARK_HOME/python:$PATH
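As an aside, the findspark package automates the same path wiring at runtime (an assumption here: it is installed separately, e.g. with pip install findspark):

import findspark

findspark.init()  # locates SPARK_HOME and adds Spark's python dirs to sys.path

from pyspark import SparkContext
sc = SparkContext('local[*]', 'PySpark')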


Source: https://stackoverflow.com/questions/41735502/spark-installation-and-configuration-on-macos-importerror-no-module-named-pyspa
