Error while running the standalone app example in Python using Spark

核能气质少年, posted on 2019-12-13 04:57:12

Question


I am just getting started with Spark and am running it in standalone mode on an Amazon EC2 instance. While working through the examples in the documentation, I keep getting this error on the one called Simple App: NameError: name 'numAs' is not defined

from pyspark import SparkContext

logFile = "$YOUR_SPARK_HOME/README.md"  # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()

numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()

print("Lines with a: %i, lines with b: %i" % (numAs, numBs))

How do I use an editor with Spark instead of this interactive Python shell? And why do I keep getting this error?

Thanks for any help/guidance.


Answer 1:


Put all your Python code in a .py file, then submit that file with spark-submit, as shown below:

# Run a Python application on a Spark Standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000
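
As a minimal sketch of what such a file might look like for the Simple App from the question: the file name simple_app.py, the helper functions contains and format_report, and the command-line argument for the input path are assumptions made for illustration, not part of the original example. The path is passed as an argument because Python does not expand a placeholder such as $YOUR_SPARK_HOME inside a string literal.

```python
# simple_app.py (hypothetical file name)
import sys


def contains(ch):
    """Return a predicate that is True for lines containing ch (illustrative helper)."""
    return lambda line: ch in line


def format_report(num_as, num_bs):
    """Build the summary line printed at the end (illustrative helper)."""
    return "Lines with a: %i, lines with b: %i" % (num_as, num_bs)


def main(log_file):
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark import SparkContext

    # No master is set here, so the --master option of spark-submit decides
    # where the job runs.
    sc = SparkContext(appName="Simple App")
    try:
        log_data = sc.textFile(log_file).cache()
        num_as = log_data.filter(contains("a")).count()
        num_bs = log_data.filter(contains("b")).count()
        print(format_report(num_as, num_bs))
    finally:
        # Release the context even if one of the counts fails.
        sc.stop()


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: spark-submit simple_app.py <path-to-text-file>")
    else:
        main(sys.argv[1])
```

Assuming the file is saved in the Spark directory, it could then be submitted with something like ./bin/spark-submit --master local simple_app.py README.md, replacing local with your own master URL when running on a cluster.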

For more details, see this page of the Spark documentation:

Submitting Applications

These examples are also very helpful:

  • https://spark.apache.org/examples.html
  • https://github.com/apache/spark/tree/master/examples/src/main/python


Source: https://stackoverflow.com/questions/30617759/error-while-running-standalone-app-example-in-python-using-spark
