Question
I am just getting started with Spark and am running it in standalone mode on an Amazon EC2 instance. I was trying the examples from the documentation, and while going through the one called Simple App I keep getting this error: NameError: name 'numAs' is not defined
from pyspark import SparkContext
logFile = "$YOUR_SPARK_HOME/README.md" # Should be some file on your system
sc = SparkContext("local", "Simple App")
logData = sc.textFile(logFile).cache()
numAs = logData.filter(lambda s: 'a' in s).count()
numBs = logData.filter(lambda s: 'b' in s).count()
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
How can I write Spark code in an editor instead of typing it into the interactive Python shell? And why do I keep getting this error?
Thanks for any help/guidance.
Answer 1:
Put all of your Python code in a .py file, then submit that file with spark-submit, like below:
# Run a Python application on a Spark Standalone cluster
./bin/spark-submit \
--master spark://207.184.161.138:7077 \
examples/src/main/python/pi.py \
1000
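As a rough sketch of what such a file might look like, here is the Simple App from the question arranged as a standalone script. The file name simple_app.py, the contains helper, and the command-line argument for the input path are assumptions made for illustration; they do not come from the original post. The master URL is left out of the code on the assumption that it is passed with --master at submit time.

```python
"""simple_app.py (hypothetical name): the question's example as a script."""
import sys


def contains(letter):
    """Return a predicate that is true for lines containing `letter`."""
    return lambda line: letter in line


def main(log_file):
    # Imported inside main so the module can be loaded without Spark;
    # spark-submit makes pyspark importable when the script actually runs.
    from pyspark import SparkContext

    # No master URL here: it is assumed to come from spark-submit --master.
    sc = SparkContext(appName="Simple App")
    try:
        log_data = sc.textFile(log_file).cache()
        num_as = log_data.filter(contains("a")).count()
        num_bs = log_data.filter(contains("b")).count()
        print("Lines with a: %i, lines with b: %i" % (num_as, num_bs))
    finally:
        # Always release the context, even if a count fails.
        sc.stop()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: spark-submit simple_app.py <path-to-text-file>")
```

Assuming the file is saved as simple_app.py in the Spark directory, it could be submitted the same way as pi.py above, with a real path to a text file as the last argument in place of 1000.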
For details, see the "Submitting Applications" page of the Spark documentation.
Try these examples too, they are really helpful:
- https://spark.apache.org/examples.html
- https://github.com/apache/spark/tree/master/examples/src/main/python
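As for the NameError itself: the post does not show the full shell session, so this is only a guess. In an interactive shell each line runs on its own, so if the line that assigns numAs fails (for example because "$YOUR_SPARK_HOME/README.md" was left as a literal placeholder, which Python does not expand), the name is never bound and the later print line raises NameError. The plain-Python sketch below mimics that sequence without Spark; count_lines_with is a hypothetical stand-in for the filter(...).count() call.

```python
# Placeholder copied from the docs; Python treats "$YOUR_SPARK_HOME" as
# literal text, it is not expanded like a shell variable.
log_file = "$YOUR_SPARK_HOME/README.md"


def count_lines_with(path, letter):
    """Hypothetical plain-Python stand-in for logData.filter(...).count()."""
    with open(path) as handle:
        return sum(1 for line in handle if letter in line)


try:
    # The right-hand side raises, so the assignment to num_as never happens.
    num_as = count_lines_with(log_file, "a")
except (IOError, OSError) as exc:
    print("counting failed: %s" % type(exc).__name__)

try:
    print(num_as)
    outcome = "defined"
except NameError as exc:
    # Same kind of error as in the question, caused by the earlier failure.
    outcome = str(exc)
    print("NameError: %s" % outcome)
```

If this is what happened, the real error is the one printed a few lines earlier in the shell, and replacing the placeholder with an actual path to README.md should make both errors go away.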
Source: https://stackoverflow.com/questions/30617759/error-while-running-standalone-app-example-in-python-using-spark