Send executable jar to hadoop cluster and run as “hadoop jar”

Posted by 对着背影说爱祢 on 2019-12-11 06:14:01

Question


I usually build an executable jar with a main method and run it from the command line with "hadoop jar Some.jar ClassWithMain input output".

In this main method, the Job and Configuration are set up, and there are setters to specify the mapper or reducer class, such as conf.setMapperClass(Mapper.class).

However, when submitting a job remotely through the Hadoop client API, I have to set the jar, the mapper, and any other classes explicitly:

job.setJarByClass(HasMainMethod.class);
job.setMapperClass(Mapper_Class.class);
job.setReducerClass(Reducer_Class.class);

I want to programmatically transfer the jar from the client to the remote Hadoop cluster and execute it there the way the "hadoop jar" command would, so that the main method itself specifies the mapper and reducer.

How can I do this?


Answer 1:


hadoop is just a shell script. Ultimately, hadoop jar invokes org.apache.hadoop.util.RunJar; what the script does for you is set up the CLASSPATH. So you can call RunJar directly.

For example,

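String input = "...";
String output = "...";
// Same arguments as "hadoop jar Some.jar ClassWithMain input output".
// RunJar.main is declared to throw Throwable, so the caller must handle or declare it.
org.apache.hadoop.util.RunJar.main(
    new String[]{"Some.jar", "ClassWithMain", input, output});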
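For intuition about what happens to those arguments, here is a minimal sketch of the core idea: load the named class from the jar and call its main method reflectively. MiniRunJar and its methods are hypothetical names used only for illustration, and this is not Hadoop's implementation. The real RunJar does more (for example, it unpacks the jar into a temporary directory first and can read the main class from the jar manifest), and this sketch assumes the main class is always given as the second argument.

```java
import java.io.File;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;

// Hypothetical class name; a simplified stand-in, not Hadoop's RunJar.
public class MiniRunJar {

    // Everything after "<jar> <mainClass>" is passed through to the job's main method.
    public static String[] programArgs(String[] args) {
        return Arrays.copyOfRange(args, 2, args.length);
    }

    // Loads mainClassName from the jar and calls its public static main(String[]).
    public static void run(String jarPath, String mainClassName, String[] programArgs)
            throws Exception {
        URL jarUrl = new File(jarPath).toURI().toURL();
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        // The parent loader must already see the Hadoop classes (the CLASSPATH requirement).
        try (URLClassLoader loader = new URLClassLoader(new URL[] {jarUrl}, previous)) {
            current.setContextClassLoader(loader);
            Class<?> mainClass = Class.forName(mainClassName, true, loader);
            Method main = mainClass.getMethod("main", String[].class);
            main.invoke(null, (Object) programArgs);
        } finally {
            // Restore the original context class loader once the job's main returns.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: MiniRunJar <jar> <mainClass> [args...]");
            return;
        }
        run(args[0], args[1], programArgs(args));
    }
}
```

The parent class loader is where the Hadoop classes have to come from, which is why the classpath of the calling JVM matters.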

However, the CLASSPATH must be set correctly before you call RunJar.main. A convenient way to obtain it is the hadoop classpath command, which prints the full Hadoop CLASSPATH.

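Then set the CLASSPATH before you run your Java application. Note that java -jar ignores the CLASSPATH variable, so put your jar on the CLASSPATH and start the application by its main class name (YourMainClass below is a placeholder for your application's entry point). For example,

export CLASSPATH=YourJar.jar:$(hadoop classpath):$CLASSPATH
java YourMainClass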
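If you would rather not depend on the environment of the calling JVM, the same idea can be driven from Java by starting a child JVM whose classpath comes from hadoop classpath. The following is a rough sketch, not a definitive implementation: HadoopJarLauncher, hadoopClasspath and buildCommand are hypothetical names, and it assumes that both hadoop and java are on the PATH and that hadoop classpath prints a single line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes `hadoop` and `java` are on the PATH.
public class HadoopJarLauncher {

    // Runs `hadoop classpath` and returns the first line of its output.
    public static String hadoopClasspath() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("hadoop", "classpath")
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        String line;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = r.readLine();
        }
        if (p.waitFor() != 0 || line == null) {
            throw new IOException("`hadoop classpath` failed");
        }
        return line.trim();
    }

    // Builds: java -cp <classpath> org.apache.hadoop.util.RunJar <jar> <mainClass> <args...>
    public static List<String> buildCommand(
            String classpath, String jar, String mainClass, String... jobArgs) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "java", "-cp", classpath, "org.apache.hadoop.util.RunJar", jar, mainClass));
        cmd.addAll(Arrays.asList(jobArgs));
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: HadoopJarLauncher <jar> <mainClass> [args...]");
            return;
        }
        String[] jobArgs = Arrays.copyOfRange(args, 2, args.length);
        List<String> cmd = buildCommand(hadoopClasspath(), args[0], args[1], jobArgs);
        // Child process shares this process's stdin/stdout/stderr.
        int exit = new ProcessBuilder(cmd).inheritIO().start().waitFor();
        System.out.println("child JVM exited with code " + exit);
    }
}
```

For example, "java HadoopJarLauncher Some.jar ClassWithMain input output" would start a child JVM that calls RunJar with the same arguments as in the question. The hadoop script may also apply JVM options and environment settings that this sketch does not reproduce.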


Source: https://stackoverflow.com/questions/18394663/send-executable-jar-to-hadoop-cluster-and-run-as-hadoop-jar
