In Hadoop, we can increment a counter in a map/reduce task like this:
...
context.getCounter(MyCountersEnum.SomeCounter).increment(1);
...
Then you can find their values in the log.
How do you access them from code after the job completes?
What is the Hadoop API to read a counter value?
Counters represent global counters, defined either by the Map-Reduce framework or by applications.
Each Counter can be of any Enum type. You can define a counter as an enum in the Driver class:
static enum UpdateCount {
    CNT
}
And then increment the counter in the map/reduce task:
public class CntReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    public void reduce(IntWritable key, Iterable<Text> values, Context context) {
        // do something
        context.getCounter(UpdateCount.CNT).increment(1);
    }
}
and access them in the Driver class:
public int run(String[] args) throws Exception {
    .
    .
    .
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);
    // Read the counter only after the job has finished.
    boolean success = job.waitForCompletion(true);
    long c = job.getCounters().findCounter(UpdateCount.CNT).getValue();
    System.out.println("CNT = " + c);
    return success ? 0 : 1;
}
}
c holds the counter value as a long.
You can find an example here
I just found the answer here.
You need a job object to access the counters:
Counters counters = job.getCounters();
Counter counter = counters.findCounter(MyCountersEnum.SomeCounter);
System.out.println(counter.getDisplayName() + ": " + counter.getValue());
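If you want every counter rather than one known enum constant, the Counters object can also be iterated group by group. Here is a minimal sketch, assuming the new-API classes in org.apache.hadoop.mapreduce are on the classpath. CounterDump and snapshot are hypothetical names, and the MyCountersEnum declared here only stands in for the one from the question. The main method fills a standalone Counters object so it runs without a cluster; in a real driver you would pass job.getCounters() instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;

public class CounterDump {

    // Hypothetical enum, standing in for MyCountersEnum from the question.
    public enum MyCountersEnum { SomeCounter }

    /** Flattens all counters into a "groupName.counterName" -> value map. */
    public static Map<String, Long> snapshot(Counters counters) {
        Map<String, Long> values = new LinkedHashMap<>();
        // Counters iterates over its groups; each group iterates over its counters.
        for (CounterGroup group : counters) {
            for (Counter counter : group) {
                values.put(group.getName() + "." + counter.getName(), counter.getValue());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Assumption for illustration: a hand-built Counters object.
        // In a driver, use job.getCounters() after waitForCompletion(true).
        Counters counters = new Counters();
        counters.findCounter(MyCountersEnum.SomeCounter).increment(3);

        snapshot(counters).forEach(
                (name, value) -> System.out.println(name + " = " + value));
    }
}
```

For enum-based counters the group name is derived from the enum class, so the map keys show which enum each counter came from.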
Source: https://stackoverflow.com/questions/27325536/how-to-access-hadoop-counters-values-via-api