I set up and configured a multi-node Hadoop cluster using this tutorial.
When I run the start-all.sh command, it shows all the processes initializing properly.
This is for newer versions of Hadoop (I am running 2.4.0).
In the file hdfs-site.xml, look out for the directory paths corresponding to dfs.namenode.name.dir and dfs.datanode.data.dir.
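For example, you can quickly check which directories those properties point to and whether they actually exist (the config path and directory names below are just illustrative; adjust them to your install):

grep -A1 -E 'dfs.namenode.name.dir|dfs.datanode.data.dir' /usr/local/hadoop/etc/hadoop/hdfs-site.xml
ls -ld /usr/local/hadoop_tmp/hdfs/namenode /usr/local/hadoop_tmp/hdfs/datanode

The -A1 prints the <value> line that follows each matching <name> line, and the ls confirms the directories exist and who owns them.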
Hope this helps.
I configured hadoop.tmp.dir in conf/core-site.xml
I configured dfs.data.dir in conf/hdfs-site.xml
I configured dfs.name.dir in conf/hdfs-site.xml
Deleted everything under the "/tmp/hadoop-/" directory
Changed file permissions from 777 to 755 for the directory listed under dfs.data.dir
And the datanode started working.
Follow these steps and your datanode will start again.
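A rough shell sketch of the cleanup and permission change above, assuming hadoop.tmp.dir is /tmp/hadoop-hduser and dfs.data.dir points to /usr/local/hadoop_tmp/hdfs/datanode (substitute your own paths):

stop-all.sh                                      # stop the cluster before touching the directories
rm -rf /tmp/hadoop-hduser/*                      # clear everything under hadoop.tmp.dir
chmod 755 /usr/local/hadoop_tmp/hdfs/datanode    # tighten the dfs.data.dir permissions from 777 to 755
start-all.sh                                     # bring the cluster back up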
Instead of deleting everything under the "hadoop tmp dir", you can set another one. For example, if your core-site.xml has this property:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hduser/data/tmp</value>
</property>
You can change this to:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hduser/data/tmp2</value>
</property>
Then scp core-site.xml to each node, run "hadoop namenode -format", and restart Hadoop.
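Roughly, that last part looks like this (the slave hostname, user, and config path are assumptions; adjust for your cluster):

scp core-site.xml hduser@slave1:/usr/local/hadoop/etc/hadoop/core-site.xml   # repeat for every node
hadoop namenode -format                                                      # re-initialize HDFS under the new tmp dir
start-all.sh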
Error in datanode.log file
$ more /usr/local/hadoop/logs/hadoop-hduser-datanode-ubuntu.log
Shows:
java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop_tmp/hdfs/datanode: namenode clusterID = CID-e4c3fed0-c2ce-4d8b-8bf3-c6388689eb82; datanode clusterID = CID-2fcfefc7-c931-4cda-8f89-1a67346a9b7c
Solution: stop your cluster, issue the command below, and then start your cluster again.
sudo rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*
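If you want to confirm the mismatch first, compare the clusterID stored in each VERSION file (the namenode path is an assumption; the datanode path comes from the log above):

grep clusterID /usr/local/hadoop_tmp/hdfs/namenode/current/VERSION
grep clusterID /usr/local/hadoop_tmp/hdfs/datanode/current/VERSION

After you delete the datanode directory and restart, the datanode re-registers and picks up the namenode's clusterID.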
Stop all the services: ./stop-all.sh. Clear the HDFS tmp directory on the master and on every slave. Don't forget to do this on the slaves.
Format the namenode (hadoop namenode -format).
Now start the services on the namenode: ./bin/start-all.sh
This is what got the datanode service to start for me.
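Put together, and assuming the same /usr/local/hadoop_tmp layout on every node with slaves reachable as slave1 and slave2 (names are just for illustration), the whole sequence is roughly:

./stop-all.sh
rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*                          # on the master
ssh hduser@slave1 'rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*'      # repeat on every slave
ssh hduser@slave2 'rm -rf /usr/local/hadoop_tmp/hdfs/datanode/*'
hadoop namenode -format
./bin/start-all.sh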