1 Symptom
On a freshly built Hadoop 3.1.1 environment, after starting Hadoop, the jps command shows that there is no DataNode process on the slave nodes:
[cndba@hadoopmaster ~]$ jps
23234 ResourceManager
22998 SecondaryNameNode
23575 Jps
22683 NameNode
[cndba@hadoopslave1 ~]$ jps
9682 Jps
9535 NodeManager
[cndba@hadoopslave2 ~]$ jps
9356 Jps
9199 NodeManager
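2 Problems caused by a clusterID mismatch
A quick search turns up one common explanation: the NameNode was formatted without first stopping all processes, or it was formatted more than once, so the clusterID of the DataNodes no longer matches the clusterID of the NameNode, and no DataNode process comes up after startup.
There are two ways to fix this:
Option 1: keep the existing data
- Take the clusterID from ~/dfs/name/current/VERSION on the NameNode node and use it to replace the clusterID in ~/dfs/data/current/VERSION on every DataNode machine.
- Restart Hadoop: start-all.sh
This approach leaves the existing data untouched and avoids reformatting.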
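The clusterID replacement in Option 1 can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function names get_cluster_id, set_cluster_id, and sync_cluster_id are made up for illustration, the NameNode's VERSION file is assumed to have been copied to the DataNode host first, and the DataNode is assumed to be stopped while the file is edited.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from Hadoop itself.

# Print the clusterID recorded in a VERSION file (key=value lines).
get_cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# Overwrite the clusterID in a VERSION file, keeping a .bak copy.
# $1 = VERSION file, $2 = new clusterID
set_cluster_id() {
    cp "$1" "$1.bak" && sed "s/^clusterID=.*/clusterID=$2/" "$1.bak" > "$1"
}

# Copy the NameNode clusterID into a DataNode VERSION file if they differ.
# $1 = NameNode VERSION file, $2 = DataNode VERSION file
sync_cluster_id() {
    nn_id=$(get_cluster_id "$1")
    dn_id=$(get_cluster_id "$2")
    if [ -z "$nn_id" ]; then
        echo "no clusterID found in $1" >&2
        return 1
    fi
    if [ "$nn_id" = "$dn_id" ]; then
        echo "clusterID already matches: $nn_id"
    else
        set_cluster_id "$2" "$nn_id"
        echo "clusterID changed from $dn_id to $nn_id"
    fi
}
```

On each DataNode this might be called as sync_cluster_id /path/to/copied/namenode/VERSION ~/dfs/data/current/VERSION, where the first path is wherever the NameNode file was copied to.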
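Option 2: reformat (the existing HDFS data is lost)
- Run ./stop-all.sh to shut down the cluster
- Delete the folder that stores the HDFS data blocks (hadoop/tmp/), then recreate it
- Delete the log files (logs) under the hadoop directory
- Run hadoop namenode -format to format HDFS (in Hadoop 3.x this form is deprecated in favor of hdfs namenode -format)
- Restart the Hadoop cluster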
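The reformat steps can be strung together roughly as follows. This is a sketch, not a tested procedure: the run and reformat_cluster helpers are hypothetical, the tmp/ and logs/ directories are assumed to sit directly under the Hadoop install directory, and hdfs namenode -format is used in place of the deprecated hadoop namenode -format. By default the script only prints what it would do.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 = Hadoop install directory (assumption: tmp/ and logs/ live under it).
reformat_cluster() {
    home=${1:?usage: reformat_cluster HADOOP_DIR}
    run "$home/sbin/stop-all.sh"          # stop every daemon first
    run rm -rf "$home/tmp"                # drop the HDFS data directory
    run mkdir -p "$home/tmp"              # and recreate it empty
    run rm -rf "$home/logs"               # old logs
    run "$home/bin/hdfs" namenode -format # new namespace, new clusterID
    run "$home/sbin/start-all.sh"
}
```

The data directory presumably has to be cleared on every DataNode host as well, not only on the master; otherwise the old clusterID left on the slaves would bring back the mismatch described above.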
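3 Other cases
My case was a different one and had nothing to do with a clusterID mismatch.
Looking at the startup output again, it turned out that a user name had been mistyped: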
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
ERROR: datanode can only be executed by cbdba.
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
In hadoop-env.sh the user name had been typed as cbdba instead of cndba:
export HDFS_DATANODE_USER="cbdba"
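A typo like this can be caught before startup by checking that every *_USER value exported in hadoop-env.sh names an account that actually exists. The following is a minimal sketch; the function name check_daemon_users is hypothetical, and it only handles simple export NAME="value" lines of the kind shown above.

```shell
#!/bin/sh
# Hypothetical helper: report *_USER exports in hadoop-env.sh whose value
# is not an existing local account. Commented-out lines are ignored.
# $1 = path to hadoop-env.sh
check_daemon_users() {
    sed -n 's/^[[:space:]]*export[[:space:]]*\([A-Z_]*_USER\)="\{0,1\}\([^"[:space:]]*\)"\{0,1\}[[:space:]]*$/\1 \2/p' "$1" |
    while read -r name user; do
        # id exits non-zero when the account does not exist
        if ! id "$user" >/dev/null 2>&1; then
            echo "$name=$user: no such user"
        fi
    done
}
```

It might be run as check_daemon_users ~/hadoop/etc/hadoop/hadoop-env.sh; the path is taken from the log output in this post and will differ on other installs. No output means every configured user exists.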
After changing it to cndba, start again:
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
hadoopslave2: ERROR: Cannot set priority of datanode process 12752
hadoopslave1: ERROR: Cannot set priority of datanode process 13164
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
This time there is a different error:
hadoopslave2: ERROR: Cannot set priority of datanode process 12752
Check the DataNode log on the slave node:
************************************************************/
2019-01-23 05:23:23,501 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2019-01-23 05:23:23,619 ERROR org.apache.hadoop.conf.Configuration: error parsing conf hdfs-site.xml
com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character ' ' (code 12288 / 0x3000) in epilog; expected '<'
 at [row,col,system-id]: [50,17,"file:/home/cndba/hadoop/etc/hadoop/hdfs-site.xml"]
    at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:653)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2133)
    at com.ctc.wstx.sr.BasicStreamReader.closeContentTree(BasicStreamReader.java:2991)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2734)
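On the surface the "Cannot set priority" message looks like a permissions problem, but the log shows that the real cause is a malformed hdfs-site.xml: a full-width space (code 12288 / 0x3000) sits after the closing root tag, at row 50, column 17. After fixing the configuration file and restarting the cluster, startup is normal: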
[cndba@hadoopmaster hadoop]$ start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as cndba in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoopmaster]
Starting datanodes
Starting secondary namenodes [hadoopmaster]
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting resourcemanager
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
Starting nodemanagers
WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
[cndba@hadoopmaster hadoop]$
Checking the processes confirms that everything is running:
[cndba@hadoopmaster hadoop]$ jps
13030 SecondaryNameNode
12791 NameNode
13271 ResourceManager
13752 Jps
[cndba@hadoopmaster hadoop]$
[cndba@hadoopslave2 logs]$ jps
13587 Jps
13302 DataNode
13422 NodeManager
[cndba@hadoopslave2 logs]$
[root@hadoopslave1 ~]# jps
13876 NodeManager
14026 Jps
13756 DataNode
[root@hadoopslave1 ~]#
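The stray full-width space that broke hdfs-site.xml is invisible in most editors, so it may help to search for it by its bytes. The sketch below assumes the configuration files are UTF-8 encoded, in which case U+3000 is the three bytes E3 80 80; the function name find_fullwidth_spaces is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: list lines that contain a full-width (ideographic)
# space, U+3000, in the given files. Output is grep's usual "line:content"
# (prefixed with the file name when several files are given).
find_fullwidth_spaces() {
    # Build the 3-byte UTF-8 sequence for U+3000 from octal escapes.
    fw=$(printf '\343\200\200')
    # LC_ALL=C makes grep compare raw bytes regardless of the locale.
    LC_ALL=C grep -n -F -e "$fw" "$@"
}
```

For the setup in this post it could be called as find_fullwidth_spaces /home/cndba/hadoop/etc/hadoop/*.xml. It prints nothing and returns a non-zero status when no file contains the character.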
Source: https://www.cnblogs.com/find1/p/11178779.html