I have set up a multi node Hadoop Cluster. The NameNode and Secondary namenode runs on the same machine and the cluster has only one Datanode. All the nodes are configured o
It is probably because the cluster ID of the datanodes and the namenodes or node manager do not match. The cluster ID can be seen in the VERSION file found in both the namenode and datanodes .
This happens when you format your namenode and then restart the cluster but the datanodes still try connecting using the previous clusterID . to be successfully connected you need the correct IP address and also a matching cluster ID on the nodes.
So try reformatting the namenode and datanodes or just configure the datanodes and namenode on newly created folders.
That should solve your problem.
Deleting the files from the current datanodes folder will also remove the old VERSION file and will request for a new VERSION file while reconnecting with the namenode.
Example you datanode directory in the configuration is /hadoop2/datanode
$ rm -rvf /hadoop2/datanode/*
And then restart services If you do reformat your namenode do it before this step. Each time you reformat your namenode it gets a new ID and that ID is randomly generated and will not match the old ID in your datanodes
So every time follow this sequence
if you Format namenode then Delete the contents of datanode directory OR configure datanode on newly created directory Then start your namenode and the datanodes