Multiple Input Paths configuration in OOZIE

杀马特。学长 韩版系。学妹 submitted on 2019-12-11 05:15:32

Question


I am trying to configure a MapReduce job in Oozie. The job has two different input formats and two input data folders. Following the post "How to configure oozie workflow for multi-input path with multiple mappers", I added these properties to my workflow.xml:

<property>
  <name>mapred.input.dir.formats</name>
  <value>folder/data/*;org.apache.hadoop.mapred.SequenceFileInputFormat\,data/*;org.apache.hadoop.mapred.TextInputFormat</value>
</property>

<property>
  <name>mapred.input.dir.mappers</name>
  <value>folder/data/*;....PublicMapper\,data/*;....PublicMapper</value>
</property>

But when the job is launched, I get the following error: "No input paths specified in job".

Can anyone help me?

Thanks


Answer 1:


You need to set some additional properties:

<property>
  <name>mapreduce.inputformat.class</name>
  <value>org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat</value>
</property>
<property>
  <name>mapreduce.map.class</name>
  <value>org.apache.hadoop.mapreduce.lib.input.DelegatingMapper</value>
</property>



Answer 2:


I faced the same issue today, so I used the following properties.

<property>
  <name>mapreduce.inputformat.class</name>
  <value>org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat</value>
</property>
<property>
  <name>mapreduce.map.class</name>
  <value>org.apache.hadoop.mapreduce.lib.input.DelegatingMapper</value>
</property>

<property>
  <name>mapreduce.input.multipleinputs.dir.formats</name>
  <value>/first/input/path;org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat,/second/input/path;org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat</value>
</property>
<property>
  <name>mapreduce.input.multipleinputs.dir.mappers</name>
  <value>/first/input/path;com.first.Mapper,/second/input/path;com.second.Mapper</value>
</property>

The difference is that instead of mapred.input.dir.formats and mapred.input.dir.mappers, which belong to the old MapReduce API, I used mapreduce.input.multipleinputs.dir.formats and mapreduce.input.multipleinputs.dir.mappers respectively. Note also that the entries in each value are separated by a plain comma, not an escaped one. The job worked fine after that. I ran it on Hadoop 1.2.1 and Oozie 3.3.2.
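
For context, the properties above would normally sit inside the configuration element of a map-reduce action in workflow.xml. The following is only a sketch of how that might look, not the exact workflow I ran. The action name, the ${jobTracker} and ${nameNode} parameters, and the "end" and "fail" transition targets are placeholders I made up for illustration. The two new-api switches are an assumption on my part: the Oozie documentation describes them as the way to make a map-reduce action use the new MapReduce API, so they are probably needed for the mapreduce.* class properties to take effect.

```xml
<!-- Hypothetical action; the name and transition targets are placeholders -->
<action name="multi-input-mr">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Assumption: switch the action to the new MapReduce API -->
      <property>
        <name>mapred.mapper.new-api</name>
        <value>true</value>
      </property>
      <property>
        <name>mapred.reducer.new-api</name>
        <value>true</value>
      </property>
      <!-- Delegating classes route each input path to its own format and mapper -->
      <property>
        <name>mapreduce.inputformat.class</name>
        <value>org.apache.hadoop.mapreduce.lib.input.DelegatingInputFormat</value>
      </property>
      <property>
        <name>mapreduce.map.class</name>
        <value>org.apache.hadoop.mapreduce.lib.input.DelegatingMapper</value>
      </property>
      <!-- Comma-separated "path;class" pairs, one pair per input folder -->
      <property>
        <name>mapreduce.input.multipleinputs.dir.formats</name>
        <value>/first/input/path;org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat,/second/input/path;org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat</value>
      </property>
      <property>
        <name>mapreduce.input.multipleinputs.dir.mappers</name>
        <value>/first/input/path;com.first.Mapper,/second/input/path;com.second.Mapper</value>
      </property>
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

A real job would usually also need output settings (output directory, key and value classes) and possibly a reducer class, which are left out here because they do not affect the multiple-input setup.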



Source: https://stackoverflow.com/questions/20194472/multiple-input-paths-configuration-in-oozie
