Question
I started a Jupyter notebook with PySpark, and my first command was:
rdd = sc.parallelize([2, 3, 4])
It then raised this error:
NameError Traceback (most recent call last)
<ipython-input-1-c540c4a1d203> in <module>()
----> 1 rdd = sc.parallelize([2, 3, 4])
NameError: name 'sc' is not defined.
How can I fix this "name 'sc' is not defined" error?
Answer 1:
Have you initialized the SparkContext?
You could try this:
# Initialize PySpark
from pyspark import SparkContext, SparkConf

# Spark config
conf = SparkConf().setAppName("sample_app")
sc = SparkContext(conf=conf)
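One caveat with the snippet above: in a notebook, re-running the cell calls SparkContext(...) a second time, which normally fails because a context is already active. A minimal sketch of a re-runnable variant follows, using SparkContext.getOrCreate. The helper name get_spark_context and the "local[*]" master are assumptions for illustration, not part of the original answer; change the master if you connect to a cluster.

```python
def get_spark_context(app_name="sample_app", master="local[*]"):
    """Return the active SparkContext, creating one if none exists.

    Hypothetical helper: the name and the default master "local[*]"
    are assumptions, not something the original answer specifies.
    """
    # Imported inside the function so that defining the helper does not
    # fail in a kernel where pyspark is missing; the call itself will.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name).setMaster(master)
    # getOrCreate reuses a running context instead of raising an error
    # when the notebook cell is executed more than once.
    return SparkContext.getOrCreate(conf)
```

With this in place, running sc = get_spark_context() and then rdd = sc.parallelize([2, 3, 4]) should work, provided pyspark and a Java runtime are available to the notebook kernel.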
Source: https://stackoverflow.com/questions/38515369/jupyter-notebook-nameerror-name-sc-is-not-defined