I'm running Hadoop 1.1.2 on a cluster with 10+ machines. I would like to scale up and down nicely, both for HDFS and MapReduce. By "nicely", I mean that I require that da
If you have not set the dfs exclude file before, first create one and point the dfs.hosts.exclude property in conf/hdfs-site.xml and the mapred.hosts.exclude property in conf/mapred-site.xml at it (a configuration sketch follows this list). Then, to decommission nodes:

1. Add the hostnames (or host:port) of the nodes to be removed to the exclude file(s).
2. Run bin/hadoop dfsadmin -refreshNodes. This forces the NameNode to reread the exclude file and start the decommissioning process.
3. Run bin/hadoop mradmin -refreshNodes so the JobTracker does the same for the TaskTrackers.
4. Wait for "Decommission complete for node XXXX.XXXX.X.XX:XXXXX" to appear in the NameNode log files; at that point the node has finished decommissioning and can be removed from the cluster.
5. Run bin/hadoop dfsadmin -report to verify.
6. Stop the datanode and tasktracker process on the excluded node(s).

To add a node back as a datanode and tasktracker, see the Hadoop FAQ page.
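For concreteness, here is a minimal sketch of what the setup could look like. The file paths and hostname are placeholders I made up for illustration, not values from your cluster; the property names and commands are the standard Hadoop 1.x ones used above.

    <!-- conf/hdfs-site.xml -->
    <property>
      <name>dfs.hosts.exclude</name>
      <value>/home/hadoop/conf/dfs.exclude</value>   <!-- hypothetical path -->
    </property>

    <!-- conf/mapred-site.xml -->
    <property>
      <name>mapred.hosts.exclude</name>
      <value>/home/hadoop/conf/mapred.exclude</value> <!-- hypothetical path -->
    </property>

    # Each exclude file lists one node per line, e.g.:
    #   datanode07.example.com        (hypothetical hostname)

    # Then trigger and verify decommissioning:
    bin/hadoop dfsadmin -refreshNodes
    bin/hadoop mradmin -refreshNodes
    bin/hadoop dfsadmin -report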
EDIT: When a live node is to be removed from the cluster, what happens to the jobs running on it?
The jobs running on a node being decommissioned are affected: the tasks of those jobs scheduled on that node are marked KILLED_UNCLEAN (for map and reduce tasks) or KILLED (for job setup and cleanup tasks). See line 4633 in JobTracker.java for details. The job is then informed to fail that task. Most of the time, the JobTracker will reschedule the task elsewhere; however, after many repeated failures it may instead decide to let the entire job fail or succeed. See line 2957 onwards in JobInProgress.java.
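As a rough illustration only (this is not the actual JobTracker/JobInProgress source; the class and method names below are made up), the decision made for a task attempt lost to decommissioning looks roughly like this. The attempt limit mirrors mapred.map.max.attempts / mapred.reduce.max.attempts (default 4 in Hadoop 1.x), and whether a failed task sinks the whole job is governed by mapred.max.map.failures.percent / mapred.max.reduce.failures.percent.

    // Illustrative sketch only -- not the real Hadoop code.
    enum AttemptState { KILLED_UNCLEAN, KILLED, FAILED }

    class TaskRetryDecision {
        // Mirrors mapred.map.max.attempts / mapred.reduce.max.attempts (default 4).
        static final int MAX_ATTEMPTS = 4;

        /** Decide what to do after a task attempt is lost on a decommissioned node. */
        static String onAttemptLost(AttemptState state, int failedAttempts) {
            if (state == AttemptState.KILLED_UNCLEAN || state == AttemptState.KILLED) {
                // Killed attempts (e.g. the TaskTracker was decommissioned) are
                // normally rescheduled and do not count against the failure limit.
                return "reschedule on another TaskTracker";
            }
            if (failedAttempts >= MAX_ATTEMPTS) {
                // Too many genuine failures: the task is failed, and the job may
                // fail too unless the configured failures.percent tolerates it.
                return "fail task (job may fail or still succeed)";
            }
            return "retry";
        }
    }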