Override Hadoop's mapreduce.fileoutputcommitter.marksuccessfuljobs in Oozie


Question


<property>
  <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
  <value>false</value>
</property>

I want to override the above property to true. The property needs to remain false for the rest of the jobs on the cluster, but in my Oozie workflow I need Hadoop to create a _SUCCESS file in the output directory after the job completes. It's a Hive action in the workflow that writes the output. Please help.
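
For context, a per-action override of this kind would normally be attempted in the Oozie action's <configuration> block. The sketch below is only an illustration, assuming a hypothetical Hive action named hive-node and a script insert_query.hql; as the answers below explain, Hive replaces the output committer, so this setting by itself may not be enough.

<!-- Hypothetical Oozie Hive action; node name, script name and transitions are placeholders. -->
<action name="hive-node">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Flip the committer flag for this action only. -->
            <property>
                <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
                <value>true</value>
            </property>
        </configuration>
        <script>insert_query.hql</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
</action>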


Answer 1:


Hive unfortunately overrides this capability by setting its own NullOutputCommitter:

conf.setOutputCommitter(NullOutputCommitter.class);

see

src/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
src/shims/src/common-secure/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java

Looks like you'll have to create the flag manually. We have filed HIVE-3700 for this.




Answer 2:


You can add a 'dfs' command to your Hive script, like:

dfs -touchz '$table_base_path'/dt='${partition}'/_SUCCESS

https://archive.cloudera.com/cdh4/cdh/4/hive/language_manual/cli.html
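
As a sketch of where that command sits, the fragment below appends the flag creation to the end of a Hive script; the table names and the ${table_base_path} and ${partition} variables are placeholders, not taken from the original answer.

-- Hypothetical Hive script fragment; table names and variables are placeholders.
INSERT OVERWRITE TABLE my_table PARTITION (dt='${partition}')
SELECT col_a, col_b FROM staging_table WHERE dt='${partition}';

-- Hive's committer does not write _SUCCESS, so create the flag explicitly.
dfs -touchz ${table_base_path}/dt=${partition}/_SUCCESS;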




Answer 3:


I ran into the same issue and ended up using a shell action to create the flag.

Here's a full example: http://nathan.vertile.com/blog/2014/09/02/oozie-data-pipeline-done-flag/
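
A minimal sketch of such a shell action is below, assuming a hypothetical wrapper script touch_success.sh that runs something like hadoop fs -touchz "$1/_SUCCESS"; the node name and ${outputDir} parameter are also placeholders, not taken from the linked post.

<!-- Hypothetical Oozie shell action that creates the flag after the Hive step. -->
<action name="mark-success">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>touch_success.sh</exec>
        <argument>${outputDir}</argument>
        <file>touch_success.sh#touch_success.sh</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
</action>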



Source: https://stackoverflow.com/questions/13017433/override-hadoops-mapreduce-fileoutputcommitter-marksuccessfuljobs-in-oozie
