NoClassDefFoundError com.apache.hadoop.fs.FSDataInputStream when execute spark-shell

前端 未结 14 1126
北荒
北荒 2020-11-30 02:07

I\'ve downloaded the prebuild version of spark 1.4.0 without hadoop (with user-provided Haddop). When I ran the spark-shell command, I got this error:

> E         


        
14条回答
  •  挽巷
    挽巷 (楼主)
    2020-11-30 02:25

    The "without Hadoop" in the Spark's build name is misleading: it means the build is not tied to a specific Hadoop distribution, not that it is meant to run without it: the user should indicate where to find Hadoop (see https://spark.apache.org/docs/latest/hadoop-provided.html)

    One clean way to fix this issue is to:

    1. Obtain Hadoop Windows binaries. Ideally build them, but this is painful (for some hints see: Hadoop on Windows Building/ Installation Error). Otherwise Google some up, for instance currently you can download 2.6.0 from here: http://www.barik.net/archive/2015/01/19/172716/
    2. Create a spark-env.cmd file looking like this (modify Hadoop path to match your installation): @echo off set HADOOP_HOME=D:\Utils\hadoop-2.7.1 set PATH=%HADOOP_HOME%\bin;%PATH% set SPARK_DIST_CLASSPATH=
    3. Put this spark-env.cmd either in a conf folder located at the same level as your Spark base folder (which may look weird), or in a folder indicated by the SPARK_CONF_DIR environment variable.

提交回复
热议问题