PySpark --py-files doesn't work

Backend · Unresolved · 7 answers · 1365 views
轻奢々 2020-12-31 01:14

I followed the documentation here: http://spark.apache.org/docs/1.1.1/submitting-applications.html

Spark version 1.1.0

./spark/bin/spark-submit --py-f
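The command above is cut off after `--py-f`, but it presumably used the `--py-files` option described in the linked documentation. For reference, a typical complete invocation looks like this (`deps.zip` and `app.py` are placeholder names, not from the original post):

```shell
# Ship deps.zip to every node and add it to each worker's PYTHONPATH
# before app.py runs; deps.zip and app.py are hypothetical names.
./spark/bin/spark-submit \
  --master local[2] \
  --py-files deps.zip \
  app.py
```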
7 Answers
  •  盖世英雄少女心
    2020-12-31 01:43

    I was facing a similar kind of problem: my worker nodes could not detect the modules even though I was using the --py-files switch.

    There were a couple of things I tried. First, I moved the import statement to after the SparkContext (sc) was created, hoping the import would run only once the module had been shipped to all nodes, but that still did not work. I then used sc.addFile to add the module from inside the script itself (instead of passing it as a command-line argument) and imported the module's functions afterwards. That did the trick, at least in my case.
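    Both `--py-files` and the in-script approach (PySpark also offers `sc.addPyFile` for Python dependencies) rely on the same mechanism: the archive is shipped to each executor and placed on `sys.path` before the import runs, which is why the import must come after the file is added. A minimal local sketch of that mechanism, without a Spark cluster (`mymodule` and `add_one` are hypothetical names for illustration):

    ```python
    import os
    import sys
    import tempfile
    import zipfile

    # Build a zip archive containing a single module, the way you would
    # package dependencies for --py-files or sc.addPyFile.
    tmpdir = tempfile.mkdtemp()
    zip_path = os.path.join(tmpdir, "deps.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("mymodule.py", "def add_one(x):\n    return x + 1\n")

    # This is what Spark effectively does on each executor: put the
    # shipped archive on sys.path before user code imports from it.
    sys.path.insert(0, zip_path)

    import mymodule  # works only because the zip is already on sys.path

    print(mymodule.add_one(41))
    ```

    If the import runs before the archive is on `sys.path` (or before the file has reached the worker), it fails with ImportError, which matches the behaviour described above.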
