Hadoop: How can I prevent failed tasks from making the whole job fail?

Posted by 余生长醉 on 2019-12-23 04:24:51

Question


I'm running a Hadoop job with, say, 1000 tasks. I need the job to attempt every task, but many of the tasks will not complete; they throw an exception instead. I cannot change this behavior, but I still need the data from the tasks that do not fail.

How can I make sure Hadoop runs all 1000 tasks even though it encounters a large number of failed tasks?


Answer 1:


In your case, you can set the maximum percentage of tasks that are allowed to fail without the whole job being marked as failed. Map tasks and reduce tasks are controlled independently, through the

    mapred.max.map.failures.percent
    mapred.max.reduce.failures.percent

properties. (On Hadoop 2 and later these deprecated names map to mapreduce.map.failures.maxpercent and mapreduce.reduce.failures.maxpercent.) So if you want the results from the 70% of tasks that succeed even when 30% fail, set both properties to 30.
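Here is a minimal driver sketch showing where these properties are set, assuming the new (org.apache.hadoop.mapreduce) API; the class and job names are placeholders, not anything from the original question:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class TolerantJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Allow up to 30% of map tasks and 30% of reduce tasks to fail
            // without failing the whole job. These are the classic (MRv1)
            // property names; on Hadoop 2+ the equivalents are
            // mapreduce.map.failures.maxpercent and
            // mapreduce.reduce.failures.maxpercent.
            conf.setInt("mapred.max.map.failures.percent", 30);
            conf.setInt("mapred.max.reduce.failures.percent", 30);

            Job job = Job.getInstance(conf, "tolerant-job"); // hypothetical job name
            // ... set jar, mapper, reducer, and input/output paths as usual ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

If your driver goes through ToolRunner, you can also pass the properties on the command line with -D (for example, -D mapred.max.map.failures.percent=30) instead of hard-coding them. Note that a task only counts as failed after it has exhausted its retry attempts (four by default), so the percentage applies to tasks that fail permanently.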



Source: https://stackoverflow.com/questions/26452565/hadoop-how-can-i-prevent-failed-tasks-from-making-the-whole-job-fail
