Hadoop accessing 3rd party libraries from local file system of a Hadoop node

Submitted by 匆匆过客 on 2019-12-01 09:09:18

Question


I have a jar file at /home/ubuntu/libs/javacv-0.9.jar on all my Hadoop nodes, along with some other jar files.

When my MapReduce application executes on the Hadoop nodes, I get this exception:

java.io.FileNotFoundException: File does not exist hdfs://192.168.0.18:50000/home/ubuntu/libs/javacv-0.9.jar

How can I resolve this exception? How can my jar running in Hadoop access 3rd party libraries from the local file system of the Hadoop node?


Answer 1:


You need to copy your file to HDFS and not to the local filesystem.

To copy files to HDFS, use:

hadoop fs -put localfile hdfsPath
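For example, to copy the jar from the question into HDFS (the /libs destination directory here is just an illustration):

hadoop fs -put /home/ubuntu/libs/javacv-0.9.jar /libs/javacv-0.9.jar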

Another option is to change the file path to:

file:///home/ubuntu/libs/javacv-0.9.jar

To add jar files to the classpath, take a look at DistributedCache:

DistributedCache.addFileToClassPath(new Path("file:///home/ubuntu/libs/javacv-0.9.jar"), job);

You may need to iterate over all jar files in that directory.
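A minimal sketch of that loop, assuming the jars sit in /home/ubuntu/libs on every node and that conf is your job's Configuration (the class and method names here are made up for illustration):

import java.io.File;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

public class LocalLibClasspath {

    // Add every jar found in a node-local directory to the task classpath.
    public static void addLocalJars(String libDir, Configuration conf) throws IOException {
        File[] jars = new File(libDir).listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            return; // directory missing or unreadable on this node
        }
        for (File jar : jars) {
            // file:// makes Hadoop read the jar from the local filesystem, not HDFS
            DistributedCache.addFileToClassPath(new Path("file://" + jar.getAbsolutePath()), conf);
        }
    }
}

Call it from the driver before submitting, e.g. addLocalJars("/home/ubuntu/libs", job.getConfiguration()). On newer Hadoop APIs, job.addFileToClassPath(...) can be used in place of the deprecated DistributedCache helper.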




Answer 2:


Another option is to use the distributed cache's addFileToClassPath(new Path("/myapp/mylib.jar"), job) to ship the jar files that should be added to the classpath of your mapper and reducer tasks.

Note: Make sure you copy the jar file to HDFS first.
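For example, matching the /myapp path used above (the jar name is taken from the question):

hadoop fs -put /home/ubuntu/libs/javacv-0.9.jar /myapp/

and then pass the resulting HDFS path (e.g. /myapp/javacv-0.9.jar) to addFileToClassPath as shown above.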

You can also add jar files to the classpath by using the hadoop command-line argument -libjars <jar_file>.

Note: Make sure your MapReduce driver implements the Tool interface and is launched through ToolRunner, so that the -libjars option is parsed from the command line.
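A minimal driver sketch wired up this way (MyJobDriver and the job configuration details are placeholders, not code from the question):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already reflects generic options such as -libjars and -D
        Job job = Job.getInstance(getConf(), "my-job");
        job.setJarByClass(MyJobDriver.class);
        // ... set mapper, reducer, input and output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
    }
}

It would then be launched with something like (myapp.jar and the input/output arguments are placeholders):

hadoop jar myapp.jar MyJobDriver -libjars /home/ubuntu/libs/javacv-0.9.jar <input> <output>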



Source: https://stackoverflow.com/questions/28213244/hadoop-accessing-3rd-party-libraries-from-local-file-system-of-a-hadoop-node
