Combiner without Reducer in Hadoop

Submitted by 谁说胖子不能爱 on 2019-12-13 14:19:10

Question


Can I write a Hadoop code that has only Mappers and Combiners (i.e. mini-reducers with no reducer)?

job.setMapperClass(WordCountMapper.class);
job.setCombinerClass(WordCountReducer.class);

conf.setInt("mapred.reduce.tasks", 0);

I tried this, but I always see one reduce task on the JobTracker page:

Launched reduce tasks = 1

How can I remove the reducers while keeping the combiners? Is that possible?


Answer 1:


In the case you describe, you should use a Reducer. As the key, use Context.getInputSplit().getPath() + Context.getInputSplit().getStart(): this combination is unique for each mapper.
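In the new mapreduce API, that key construction could be sketched as follows. The cast to FileSplit is an assumption (it holds for FileInputFormat-based jobs), and PerSplitKeyMapper is a hypothetical class name:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Hypothetical mapper sketch: builds a key that is unique per input split,
// as suggested above. Assumes the job uses a FileInputFormat, so the
// InputSplit can be cast to FileSplit.
public class PerSplitKeyMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Text splitKey = new Text();

    @Override
    protected void setup(Context context) {
        FileSplit split = (FileSplit) context.getInputSplit();
        // File path plus start offset uniquely identifies this mapper's split
        splitKey.set(split.getPath().toString() + ":" + split.getStart());
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit everything under the per-split key, so one reduce group
        // corresponds to one mapper's output
        context.write(splitKey, value);
    }
}
```

With this key, a regular Reducer then sees each mapper's output as a single group, which gives the same per-mapper aggregation the asker wanted from a combiner.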




Answer 2:


You need to tell your job that you don't care about the reducer: JobConf.html#setNumReduceTasks(int)

// new Hadoop API (org.apache.hadoop.mapreduce.Job)
job.setNumReduceTasks(0);

// old Hadoop API (org.apache.hadoop.mapred.JobConf)
jobConf.setNumReduceTasks(0);

You can achieve the same with the IdentityReducer, whose documentation says:

Performs no reduction, writing all input values directly to the output.

I'm not sure whether you can keep the combiners, but I would start with the lines above.
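Putting the lines above into a driver, a map-only job might be sketched like this (MapOnlyWordCount and the command-line paths are hypothetical; WordCountMapper is the class from the question). As to the combiner question: with zero reduce tasks, Hadoop skips the map-side sort and spill entirely and writes map output straight to the output format, so a configured combiner is never invoked:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver sketch for a map-only job (new mapreduce API).
public class MapOnlyWordCount {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-only wordcount");
        job.setJarByClass(MapOnlyWordCount.class);

        job.setMapperClass(WordCountMapper.class); // mapper from the question
        job.setNumReduceTasks(0);                  // map-only: no shuffle, no reduce,
                                                   // and no combiner invocation

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If you genuinely need per-mapper aggregation without a reduce phase, one common workaround is to do the combining inside the mapper itself (an in-mapper combiner: accumulate counts in a map and emit them in cleanup()), rather than relying on the framework's combiner.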



Source: https://stackoverflow.com/questions/22173788/combiner-without-reducer-in-hadoop
