Question
I'm very frustrated with Spark. I've wasted an evening thinking I was doing something wrong, but I have uninstalled and reinstalled several times, following multiple guides that all describe a very similar path.
At the Windows command prompt, I am trying to run:
pyspark
or
spark-shell
The steps I followed include downloading a pre-built package from:
https://spark.apache.org/downloads.html
including Spark 2.0.2 with Hadoop 2.3 and Spark 2.1.0 with Hadoop 2.7.
Neither works, and I get this error:
'Files\Spark\bin\..\jars""\' is not recognized as an internal or external command,
operable program or batch file.
Failed to find Spark jars directory.
You need to build Spark before running this program.
I've set up my environment variables properly and have also applied the winutils.exe trick, but these seem unrelated to the problem at hand.
I can't be the only one who's stuck on this problem. Does anyone know a workaround for getting this to work on Windows?
Answer 1:
I've just found a solution in one of the answers to this question:
Why does spark-submit and spark-shell fail with "Failed to find Spark assembly JAR. You need to build Spark before running this program."?
The following answer worked for me and is totally counter-intuitive:
"On Windows, I found that if it is installed in a directory that has a space in the path (C:\Program Files\Spark) the installation will fail. Move it to the root or another directory with no spaces."
Answer 2:
This problem is caused by your environment variable settings; you have probably set the SPARK_HOME value to 'Program Files\Spark\bin', which has two issues:
- you have to remove the bin; the Spark home is just 'Program Files\Spark\'
- since the path to the Spark home contains a space, it causes a problem, so you can set it as 'Progra~1\Spark\' instead
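For example, a sketch assuming Spark really does sit under C:\Program Files\Spark (Progra~1 is the 8.3 short name Windows generates for Program Files, so the value contains no space):
REM persist SPARK_HOME without a space in its value (new cmd windows pick this up)
setx SPARK_HOME C:\Progra~1\Spark
REM or set it only for the current session
set SPARK_HOME=C:\Progra~1\Spark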
Source: https://stackoverflow.com/questions/42742703/spark-on-windows-10-files-spark-bin-jars-is-not-recognized-as-an-intern