Spark Unable to load native-hadoop library for your platform

Submitted by 南楼画角 on 2019-11-30 06:53:36

Question


I'm a dummy on Ubuntu 16.04, desperately trying to make Spark work. I've tried to fix my problem using the answers found here on Stack Overflow, but I couldn't resolve anything. When I launch Spark with the command ./spark-shell from the bin folder, I get this message:

WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

My Java version is:

java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

Spark is the latest version, 2.0.1 with Hadoop 2.7. I also retried with an older Spark package, 1.6.2 with Hadoop 2.4, but I get the same result. I also tried to install Spark on Windows, but it seems harder than doing it on Ubuntu.

I also tried to run some commands in Spark from my laptop: I can define an object, create an RDD and store it in the cache, and use functions like .map(), but when I try to run .reduceByKey() I receive several lines of error messages.

Maybe the Hadoop library is compiled for 32-bit, while I'm on 64-bit?

Thanks.


Answer 1:


Steps to fix:

  • Download the Hadoop binaries.
  • Unpack them to a directory of your choice.
  • Set HADOOP_HOME to point to that directory.
  • Add $HADOOP_HOME/lib/native to LD_LIBRARY_PATH.
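The last two steps can be sketched as a pair of exports, for example in ~/.bashrc. This is only an illustration: the path $HOME/hadoop is an assumed install location, not one given in the answer, so replace it with the directory you actually unpacked Hadoop into.

```shell
# Assumed location of the unpacked Hadoop distribution (hypothetical path;
# point this at the directory you chose when unpacking).
export HADOOP_HOME="$HOME/hadoop"

# Prepend Hadoop's native library directory to LD_LIBRARY_PATH.
# The ${VAR:+...} expansion keeps any existing entries and avoids a
# trailing colon when LD_LIBRARY_PATH was previously empty or unset.
export LD_LIBRARY_PATH="$HADOOP_HOME/lib/native${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

Open a new terminal (or source the file) before launching ./spark-shell again so the variables are picked up.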



Answer 2:


  1. Download the Hadoop binary (link) and put it in your home directory (you can choose a different Hadoop version if you like and change the next steps accordingly).
  2. Unpack the archive in your home directory using the following command: tar -zxvf hadoop_file_name
  3. Add export HADOOP_HOME=~/hadoop-2.8.0 to your .bashrc file. Open a new terminal and try again.

Source: Install PySpark on ubuntu
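After following the steps above, it may help to confirm that the directory HADOOP_HOME points to really contains the native library. The sketch below is not part of the original answer: check_native_dir is a hypothetical helper, and the fallback path $HOME/hadoop-2.8.0 simply mirrors the example in step 3.

```shell
# check_native_dir DIR
# Hypothetical helper: prints "found" if DIR/lib/native/libhadoop.so exists,
# otherwise prints "missing".
check_native_dir() {
  if [ -e "$1/lib/native/libhadoop.so" ]; then
    echo "found"
  else
    echo "missing"
  fi
}

# Check the configured HADOOP_HOME, falling back to the path used in step 3.
check_native_dir "${HADOOP_HOME:-$HOME/hadoop-2.8.0}"
```

If this prints "missing", HADOOP_HOME probably points at the wrong directory. With Hadoop's bin directory on the PATH, running hadoop checknative -a should give a more detailed report of which native libraries were loaded.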



Source: https://stackoverflow.com/questions/40015416/spark-unable-to-load-native-hadoop-library-for-your-platform
