Question
My Spark package is spark-2.2.0-bin-hadoop2.7.
I exported the Spark environment variables as follows:
export SPARK_HOME=/home/harry/spark-2.2.0-bin-hadoop2.7
export PATH=$SPARK_HOME/bin:$PATH
I opened the Spark notebook by running:
pyspark
I am able to import packages from Spark:
from pyspark import SparkContext, SQLContext
from pyspark.ml.regression import LinearRegression
print(SQLContext)
The output is:
<class 'pyspark.sql.context.SQLContext'>
But this statement raises an error:
print(sc)
"sc is undefined"
Can anyone help me out?
Answer 1:
In the PySpark shell, a SparkContext is already initialized as SparkContext(app=PySparkShell, master=local[*]), so you just need to call getOrCreate() to bind that SparkContext to a variable:
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
For a standalone script in simple local mode, you can create the context yourself:
from pyspark import SparkConf, SparkContext, SQLContext
conf = SparkConf().setAppName("test").setMaster("local")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)
print(sc)
print(sqlContext)
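If the same code has to run both in the shell (where sc may already exist) and in a notebook or script (where it may not), the two approaches above could be combined into one helper. This is only a sketch under assumptions: the function name get_spark_context and its namespace parameter are hypothetical, not part of PySpark, and it goes through SparkSession, which is available from Spark 2.0 onward, rather than constructing SparkContext directly.

```python
def get_spark_context(namespace, app_name="test", master="local[*]"):
    """Return the `sc` already present in `namespace`, or create one.

    `namespace` is any dict-like mapping of names to objects, typically
    globals() in a notebook cell. This helper is illustrative only.
    """
    existing = namespace.get("sc")
    if existing is not None:
        # The shell (or an earlier cell) already defined sc, so reuse it.
        return existing

    # Imported lazily so the helper can be defined even where pyspark
    # is not importable; the import only happens when a context is needed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName(app_name)
        .master(master)
        .getOrCreate()  # reuses a running session instead of starting a second one
    )
    return spark.sparkContext
```

In a notebook cell you would call it as sc = get_spark_context(globals()) and then print(sc) as in the question.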
Source: https://stackoverflow.com/questions/48761626/sc-is-not-defined-in-sparkcontext