Adding jars to the classpath of the code that launches map reduce job

Submitted by 微笑、不失礼 on 2019-12-25 04:07:15

Question


I am trying to launch a MapReduce job from an application that implements the Tool interface. The application does a few other things that act as preconditions for the MapReduce job.

This class uses some third-party libraries. How do I add those jars to the classpath when running the jar with the command: hadoop jar <myjar> [args]?
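For context, here is a minimal sketch of the kind of driver described above. The class name MyJobDriver, the job name, and the commented-out "precondition" call are placeholders, not part of the original question; the point is that the precondition code runs in the client JVM, which is why the third-party jars are only needed on the launcher's classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    // Hypothetical driver: it does some client-side work that depends on
    // third-party libraries before submitting the MapReduce job.
    public class MyJobDriver extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // Precondition work that needs the third-party jars would happen
            // here, in the client JVM (placeholder, e.g.):
            // thirdPartyClient.prepare(args[0]);

            Job job = Job.getInstance(getConf(), "my-job");
            job.setJarByClass(MyJobDriver.class);
            // job.setMapperClass(...), job.setReducerClass(...), input/output paths, etc.
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
        }
    }

With a driver like this, a command such as hadoop jar my-map-reduce-job.jar MyJobDriver [args] (class name hypothetical) runs the precondition code on the client machine before the job is submitted to the cluster.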

Following this Cloudera post, I tried setting the HADOOP_CLASSPATH environment variable to the third-party jar, but it did not work. The third-party jars mentioned above are only required by the class that launches the job, not by the Mapper/Reducer classes, so I do not need to put them in the Distributed Cache.

When I copy the third-party jars I need into $HADOOP_HOME/lib, it works, but I need a cleaner solution.

Thanks in anticipation.

Note - I know that putting all the third-party jars in a lib directory inside my-map-reduce-job.jar would work, but I do not have that liberty: the jar is built with Maven, and I want these third-party jars to stay outside of my-map-reduce-job.jar.


Answer 1:


For future reference: setting the HADOOP_CLASSPATH environment variable on the client machine from which you launch the MapReduce job is the way to go.

I figured out my mistake: I was exporting HADOOP_CLASSPATH the wrong way. The separator between the jars is platform dependent; on Unix it is a colon (:).

    export HADOOP_CLASSPATH=/path/to/my/jar1:/path/to/my/jar2

and then

    hadoop jar <myjar> [mainClass] [args]

If HADOOP_CLASSPATH has already been defined elsewhere, you might want to append your jars to it instead:

    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/my/jar1:/path/to/my/jar2



Source: https://stackoverflow.com/questions/27731065/adding-jars-to-the-classpath-of-the-code-that-launches-map-reduce-job
