How to run a script in PySpark

Backend · Open · 5 answers · 2030 views
太阳男子 2020-12-14 15:59

I'm trying to run a script in the pyspark environment, but so far I haven't been able to. How can I run a script the way I would with python script.py, but in pyspark? Thanks

5 Answers
  • 2020-12-14 16:27

    pyspark 2.0 and later execute the script file pointed to by the PYTHONSTARTUP environment variable, so you can run:

    PYTHONSTARTUP=code.py pyspark
    

    Compared to the spark-submit answer, this is useful for running initialization code before dropping into the interactive pyspark shell.
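    For instance, the startup file can define helpers for the interactive session. A minimal sketch (code.py, load_events, and the parquet path are hypothetical; `spark` is the SparkSession the pyspark shell creates before running the file):

```python
# code.py -- hypothetical PYTHONSTARTUP script for the pyspark shell.
# The shell defines `spark` (a SparkSession) before this file runs, so
# helpers here may reference it at call time.
import datetime

# Record when the session was initialized.
SESSION_STARTED = datetime.datetime.now()

def load_events(path):
    """Read a parquet dataset using the shell's SparkSession."""
    return spark.read.parquet(path)  # `spark` is provided by the pyspark shell
```

    Launching with PYTHONSTARTUP=code.py pyspark then drops you into the shell with SESSION_STARTED and load_events already defined.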

  • 2020-12-14 16:33

    You can do: ./bin/spark-submit mypythonfile.py

    Running Python applications through pyspark is not supported as of Spark 2.0.
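    Note that a script run through spark-submit does not get the shell's predefined spark variable; it has to build its own session. A minimal sketch (file and function names are placeholders; the session is passed as a parameter so the logic can be exercised without a cluster):

```python
# mypythonfile.py -- minimal sketch of a script for spark-submit
# (all names are placeholders). Unlike the interactive shell, a submitted
# script must create its own SparkSession, e.g.:
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("demo").getOrCreate()

def row_count(spark, n):
    """Count the rows of spark.range(n); the session is injected so the
    logic stays testable outside spark-submit."""
    return spark.range(n).count()

# When submitted: print(row_count(spark, 5)); spark.stop()
```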

  • 2020-12-14 16:34

    You can execute script.py as follows:

    pyspark < script.py
    

    or

    # to run pyspark on a YARN cluster
    pyspark --master yarn < script.py
    
  • 2020-12-14 16:42

    Just spark-submit mypythonfile.py should be enough.

  • 2020-12-14 16:48

    The Spark environment provides a command to execute an application file, whether written in Scala or Java (packaged as a JAR), Python, or R. The command is:

    $ spark-submit --master <url> <SCRIPTNAME>.py

    I'm running Spark on a 64-bit Windows system with JDK 1.8.

