I installed Spark on Windows, and I'm unable to start pyspark. When I type in c:\Spark\bin\pyspark, I get the following error:
Spark <= 2.1.0 is not compatible with Python 3.6. See this issue, which also claims that this will be fixed with the upcoming Spark release.
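As a rough illustration of that rule, here is a minimal sketch of a check you could run before launching pyspark. The helper name is_affected is hypothetical (it is not part of Spark), and it assumes that every Python version from 3.6 onward is affected and that Spark releases after 2.1.0 are not, which is only what the statement above suggests.

```python
import sys


def is_affected(spark_version, python_version=None):
    """Return True if the Spark/Python pair matches the incompatibility above.

    Hypothetical helper: spark_version is a string such as "2.1.0" and
    python_version is a (major, minor) tuple, defaulting to the running
    interpreter.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    # Compare versions as integer tuples, e.g. "2.0.2" -> (2, 0, 2).
    spark = tuple(int(part) for part in spark_version.split(".")[:3])
    # Assumption: Spark <= 2.1.0 combined with Python >= 3.6 is the bad pair.
    return spark <= (2, 1, 0) and tuple(python_version[:2]) >= (3, 6)


if __name__ == "__main__":
    if is_affected("2.1.0"):
        print("This interpreter is likely too new for Spark 2.1.0.")
```

Calling is_affected("2.1.0", (3, 5)) returns False, which matches the workaround of running Spark under Python 3.5.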
I resolved this issue with one change to a Python script.
The script is serializers.py, located in c:\your-installation-dir\spark-2.0.2-bin-hadoop-2.7\python\pyspark\
Replace line 381 with the line below:
cls = _old_namedtuple(*args, **kwargs, verbose=False, rename=False, module=None)
Then run pyspark from your command line, and it should work.
The issues you are likely to face when running Spark on Windows come from not setting the path properly or from using Python 3.x to run Spark.
So,
I wanted to expand on Indrajeet's answer, since he mentioned line numbers instead of the exact location of the code. Please see this in addition to his answer for further clarification.
def _hijack_namedtuple():
    """ Hack namedtuple() to make it picklable """
    # hijack only one time
    if hasattr(collections.namedtuple, "__hijack"):
        return

    global _old_namedtuple  # or it will put in closure

    def _copy_func(f):
        return types.FunctionType(f.__code__, f.__globals__, f.__name__,
                                  f.__defaults__, f.__closure__)

    _old_namedtuple = _copy_func(collections.namedtuple)

    def namedtuple(*args, **kwargs):
        # cls = _old_namedtuple(*args, **kwargs)
        cls = _old_namedtuple(*args, **kwargs, verbose=False, rename=False, module=None)
        return _hack_namedtuple(cls)
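To show why passing those keyword arguments explicitly helps, here is a small self-contained sketch. In Python 3.6, verbose, rename and module are keyword-only parameters of namedtuple, and a copy made with types.FunctionType does not carry over the defaults of keyword-only parameters (they live in __kwdefaults__, which the constructor does not accept). The function sample below is a hypothetical stand-in, not Spark code, and restoring __kwdefaults__ is shown only as one possible alternative to hard-coding the arguments.

```python
import types


def sample(a, *, flag=False):
    """Hypothetical stand-in for a function with a keyword-only default."""
    return (a, flag)


def copy_func(f):
    # Same copying approach as _copy_func in the snippet above.
    return types.FunctionType(f.__code__, f.__globals__, f.__name__,
                              f.__defaults__, f.__closure__)


copied = copy_func(sample)

# The copy has lost the default for the keyword-only parameter,
# so calling it without that argument fails.
try:
    copied(1)
    lost_defaults = False
except TypeError:
    lost_defaults = True

# Passing the keyword-only argument explicitly works, which is what
# the patched line does for verbose, rename and module.
explicit_result = copied(1, flag=False)

# Alternative sketch: copy the keyword-only defaults onto the new function.
restored = copy_func(sample)
restored.__kwdefaults__ = sample.__kwdefaults__
restored_result = restored(1)
```

Hard-coding the arguments is the smaller change, but it overrides any values a caller passes for the same names, whereas restoring __kwdefaults__ keeps the original call signature intact.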
EDIT (6th Mar 2017): This did fix the original issue, but I don't think it makes Spark 2.1 compatible with Python 3.6 yet; there were more collisions further down. As a result, I used conda to create a Python 3.5 virtual environment, and it worked like a charm.
(Windows, assuming you have env variables in place)
>conda create -n py35 python=3.5
>activate py35
>pyspark
Spark 2.1.0 doesn't support Python 3.6.0. To solve this, change the Python version in your Anaconda environment by running the following commands:
conda create -n py35 python=3.5 anaconda
activate py35