I'm trying to run a script in the pyspark environment but so far I haven't been able to. How can I run a script like python script.py but in pyspark? Thanks
pyspark 2.0 and later executes the script file named in the environment variable PYTHONSTARTUP, so you can run:
PYTHONSTARTUP=code.py pyspark
Unlike the spark-submit answer, this is useful for running initialization code before starting the interactive pyspark shell.
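The PYTHONSTARTUP mechanism comes from plain CPython, not from Spark itself: any interactive Python session executes the file named in that variable before the first prompt. A minimal sketch of the behavior, using the plain python REPL in place of pyspark (the file content and variable names here are made up for the demo):

```python
import os
import subprocess
import sys
import tempfile

# Write a throwaway init file; in real use this would be code.py with
# your Spark setup (e.g. helper functions you want pre-loaded).
startup_code = "greeting = 'startup file was executed'\nprint(greeting)\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(startup_code)
    startup_path = f.name

# PYTHONSTARTUP=code.py pyspark  ~  same variable, plain python REPL here.
env = dict(os.environ, PYTHONSTARTUP=startup_path)

# -i forces interactive mode even with piped stdin; empty input means
# the REPL exits right after running the startup file.
result = subprocess.run(
    [sys.executable, "-i"],
    input="", capture_output=True, text=True, env=env,
)
print(result.stdout)
os.unlink(startup_path)
```

Names defined in the startup file (here `greeting`) stay available in the interactive session that follows, which is exactly why this suits pre-shell initialization.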
You can do:
./bin/spark-submit mypythonfile.py
Running Python applications through the pyspark shell itself is not supported as of Spark 2.0.
You can execute script.py as follows:
pyspark < script.py
or
# to run pyspark on a YARN cluster
pyspark --master yarn < script.py
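The redirect works because the pyspark shell is essentially a Python REPL reading commands from standard input. The same idea can be sketched with the plain python interpreter standing in for pyspark (the script content below is a made-up placeholder):

```python
import subprocess
import sys
import tempfile

# A stand-in for script.py; a real one would contain pyspark code.
script = "x = 6 * 7\nprint('answer:', x)\n"
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(script)
    path = f.name

# Equivalent of `pyspark < script.py`, with plain python as the REPL.
with open(path) as src:
    result = subprocess.run(
        [sys.executable], stdin=src, capture_output=True, text=True
    )
print(result.stdout)
```

One caveat of the redirect approach: the shell exits when the script's input ends, so it suits batch runs rather than interactive work.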
Just spark-submit mypythonfile.py
should be enough.
The Spark environment provides a command to execute an application file, whether it is written in Scala or Java (packaged as a JAR), Python, or R. The command is:
$ spark-submit --master <url> <SCRIPTNAME>.py
I'm running Spark on a 64-bit Windows system with JDK 1.8.
P.S. Find a screenshot of my terminal window attached.