Question
I know I can control the maximum memory for a map (or reduce) task by setting JVM parameters. But is there a way to see the current memory usage of a task?
Answer 1:
Enable remote HPROF profiling. HPROF is a profiling tool that ships with the JDK; although basic, it can give valuable information about a program's CPU and heap usage. To use it, set the following properties on the job configuration in your code:
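// Turn on task profiling for this job.
conf.setBoolean("mapred.task.profile", true);
// HPROF agent options; Hadoop replaces %s with the profile output file of each task.
conf.set("mapred.task.profile.params", "-agentlib:hprof=cpu=samples," +
    "heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
// Profile only map tasks 0 through 2.
conf.set("mapred.task.profile.maps", "0-2");
conf.set("mapred.task.profile.reduces", ""); // no reduces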
See "Hadoop: The Definitive Guide", Chapter 5 -> "Tuning a Job" -> "Profiling Tasks" for more details.
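HPROF reports on heap usage after the fact. If a rough live reading is enough, a task could also log its own heap usage with the standard `java.lang.management` API. This is a different, lighter-weight technique than the profiling above, and the sketch below is only an illustration: the class name `HeapUsageLogger` and the method `describeHeap` are hypothetical and not part of Hadoop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not a Hadoop class): reports the heap usage of the
// JVM it runs in, which for a map or reduce task is the task's own JVM.
public class HeapUsageLogger {

    /** Formats the current heap usage of this JVM in megabytes. */
    static String describeHeap() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long mb = 1024L * 1024L;
        // getMax() returns -1 when the maximum heap size is undefined.
        String max = heap.getMax() < 0
                ? "undefined"
                : (heap.getMax() / mb) + " MB";
        return "heap used=" + (heap.getUsed() / mb) + " MB"
                + ", committed=" + (heap.getCommitted() / mb) + " MB"
                + ", max=" + max;
    }

    public static void main(String[] args) {
        // Inside a task, the same call could be made from map() or cleanup().
        System.err.println(describeHeap());
    }
}
```

Assuming the usual setup where a task's stderr is captured in its task logs, calling `describeHeap()` every few thousand records from `map()` would give a coarse picture of how heap usage changes while the task runs.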
Source: https://stackoverflow.com/questions/12314762/how-to-check-memory-footprint-of-map-task-in-hadoop