Question
After installing Spark 2.3, I set the following environment variables in .bashrc (using Git Bash):
- HADOOP_HOME
- SPARK_HOME
- PYSPARK_PYTHON
- JDK_HOME
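For reference, the entries look roughly like this (the install paths below are placeholders, not my actual locations):
# illustrative Git Bash (/c/...) paths only; substitute your real install directories
export HADOOP_HOME="/c/hadoop"
export SPARK_HOME="/c/spark/spark-2.3.0-bin-hadoop2.7"
export PYSPARK_PYTHON="python"
export JDK_HOME="/c/Program Files/Java/jdk1.8.0_161"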
Executing $SPARK_HOME/bin/spark-submit then displays the following error:
Error: Could not find or load main class org.apache.spark.launcher.Main
I did some research on Stack Overflow and other sites, but could not figure out the problem.
Execution environment
- Windows 10 Enterprise
- Spark version - 2.3
- Python version - 3.6.4
Can you please provide some pointers?
Answer 1:
I had that error message. It may have several root causes, but this is how I investigated and solved the problem (on Linux):
- Instead of launching spark-submit directly, try bash -x spark-submit to see which line fails.
- Repeat that process several times (since spark-submit calls nested scripts) until you find the underlying process that is called; in my case it was something like:
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -cp '/opt/spark-2.2.0-bin-hadoop2.7/conf/:/opt/spark-2.2.0-bin-hadoop2.7/jars/*' -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name 'Spark shell' spark-shell
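A minimal sketch of that tracing, assuming the usual layout where spark-submit hands off to spark-class (--version is just a cheap way to exercise the scripts):
# trace the top-level script; the last lines show the command it handed off to
bash -x "$SPARK_HOME/bin/spark-submit" --version 2>&1 | tail -n 20
# spark-submit delegates to spark-class, so trace that next to see the actual java invocation
bash -x "$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.SparkSubmit --version 2>&1 | tail -n 40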
So spark-submit launches a java process that cannot find the org.apache.spark.launcher.Main class using the files in /opt/spark-2.2.0-bin-hadoop2.7/jars/* (see the -cp option above). I did an ls in this jars folder and counted 4 files instead of the whole Spark distribution (~200 files).
It was probably a problem during the installation process, so I reinstalled Spark, checked the jars folder, and it worked like a charm.
So, you should:
- check the java command (the -cp option)
- check your jars folder (does it contain at least all the spark-*.jar files?)
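A rough sketch of those two checks, assuming SPARK_HOME points at the installation (the ~200 figure is only a ballpark for a 2.x distribution):
# count the jars that shipped with the distribution; a handful instead of ~200 suggests a broken install
ls "$SPARK_HOME/jars" | wc -l
# the class from the error message lives in the spark-launcher jar, so make sure it is present
ls "$SPARK_HOME/jars" | grep spark-launcher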
Hope it helps.
Answer 2:
I also ran into the same problem. The cause is that some required files are missing, so delete the Spark folder from your C: drive and install it again.
Source: https://stackoverflow.com/questions/50435286/spark-installation-error-could-not-find-or-load-main-class-org-apache-spark-l