spark <console>:12: error: not found: value sc


Question:

I wrote the following:

val a = 1 to 10000
val b = sc.parallelize(a)

and it shows error saying:

<console>:12: error: not found: value sc

Any help?

Answer 1:

It happens when your classpath is not correct. This is an open issue in Spark at the moment.

> spark-shell
...
...
14/08/08 18:41:50 INFO SparkILoop: Created spark context..
Spark context available as sc.

scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@2c1c5c2e

scala> :cp /tmp
Added '/tmp'.  Your new classpath is:
...

scala> sc
<console>:8: error: not found: value sc

You may need to correct your classpath from outside the REPL.
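For example, the extra entries can be supplied when launching the shell instead of with :cp inside the REPL. A minimal sketch, where /path/to/extra-lib.jar is only a placeholder for whatever your code depends on:

spark-shell --jars /path/to/extra-lib.jar --driver-class-path /path/to/extra-lib.jar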



Answer 2:

In my case I have Spark installed on a local Windows system, and I observed the same error, but it was caused by the issue below:

Issue: Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable.

This was caused by a permission issue. I resolved it by changing the permissions with the command below. Although the log says "on HDFS", this path is on the Windows system:

E:\winutils\bin\winutils.exe chmod 777 E:\tmp\hive
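To double-check that the change took effect, winutils also has an ls subcommand that prints the current permissions (assuming the same winutils location as above):

E:\winutils\bin\winutils.exe ls E:\tmp\hive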



Answer 3:

You get this error because sc is not defined. I would try the following (this is PySpark syntax; a Scala sketch for the Scala shell follows below):

from pyspark import SparkContext

sc = SparkContext(appName="foo")
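Since the question uses the Scala shell, a rough Scala equivalent is the following. This is only a sketch; setMaster("local[*]") is an assumption for a local run and the app name is arbitrary:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("foo").setMaster("local[*]")
val sc = new SparkContext(conf)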

Another thing that usually happens to me is not having a Kerberos ticket on the cluster, because I forgot to get one.
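In that case, obtaining a ticket before starting the shell is usually enough; the principal below is just a placeholder:

kinit your_user@YOUR.REALM
klist   # verify the ticket was granted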


As for the "open issue in Spark" mentioned by Solnanki, I am pretty sure this is not the case any more.



Answer 4:

First, check the log output after running the spark-shell command to see whether the SparkContext was initialized as sc.

If the SparkContext is not initialized properly, you have to set the IP address in the Spark environment.

Open conf/spark-env.sh and add the line below:

export SPARK_LOCAL_IP="127.0.0.1"
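If you would rather not edit conf/spark-env.sh, the same variable can also be exported in your shell session before launching spark-shell, since spark-env.sh is just a convenient place to set such variables; for example:

export SPARK_LOCAL_IP="127.0.0.1"
spark-shell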



Answer 5:

I hit this error when trying out Spark on the Cloudera Quickstart VM. It turned out to be an HDFS file permission issue on /user/spark.

I could not switch to the user "spark"; I got a "user not available" error. Changing the file permissions with the command below solved it for me.

sudo -u hdfs hadoop fs -chmod -R 1777 /user/spark

scala> val data = 1 to 10000
data: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170...

scala> val distData = sc.parallelize(data)
distData: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:14


Answer 6:

I faced the same problem. In my case JAVA_HOME was not set properly, which caused this issue. Surprisingly, Spark would start, but the sc context had issues creating an instance. When I fixed JAVA_HOME to point to the correct Java directory, the issue was resolved. I had to close the session and open a new one to ensure the path was updated and a fresh session was started.
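As a sketch of the fix on a Unix-like system (the JDK path below is only an example; point it at your actual installation, and on Windows set the variable through System Properties instead):

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"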

I hope this helps.



Answer 7:

As stated in this thread, one solution may be to switch off permissions checking.

In Cloudera Manager, go to the HDFS configuration under Advanced, and put the following in "HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml":

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

After that, it is necessary to restart the HDFS component.

It worked for me. It might not be appropriate for a production environment, however.


