Question
I'm trying to run the Spring Boot YARN sample (https://spring.io/guides/gs/yarn-basic/) on Windows. In application.yml I changed fsUri and resourceManagerHost to point to my VM's host, 192.168....
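For reference, the relevant part of application.yml looks roughly like this (a sketch only; 192.168.x.x is a placeholder for the VM's IP, and the port is assumed to match fs.defaultFS below):
spring:
  hadoop:
    fsUri: hdfs://192.168.x.x:9000     # HDFS NameNode URI (placeholder IP)
    resourceManagerHost: 192.168.x.x   # YARN ResourceManager host (placeholder IP)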
But when I try to run the application, this exception appears:
DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1508)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1284)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
[2017-05-27 19:59:49.570] boot - 7728 INFO [Thread-5] --- DFSClient: Abandoning BP-646365587-10.0.2.15-1495898351938:blk_1073741830_1006
[2017-05-27 19:59:49.602] boot - 7728 INFO [Thread-5] --- DFSClient: Excluding datanode DatanodeInfoWithStorage[10.0.2.15:50010,DS-f909ec7a-8374-4cdd-9cfc-0e778810d98c,DISK]
[2017-05-27 19:59:49.647] boot - 7728 WARN [Thread-5] --- DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /app/gs-yarn-basic/gs-yarn-basic-container-0.1.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
This means the DataNode isn't accessible from my host machine. For that reason I added the following to hdfs-site.xml:
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether clients should use datanode hostnames when
    connecting to datanodes.
  </description>
</property>
But it still throws that exception.
I've got Hadoop 2.8.0 running on my VM. Here are the config files:
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://0.0.0.0:9000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/hadoop-2.8.0/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/hadoop-2.8.0/data/datanode</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
    <description>Whether clients should use datanode hostnames when
      connecting to datanodes.
    </description>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
    <value>99</value>
  </property>
</configuration>
Answer 1:
Your core-site.xml should point to the NameNode's address, but it currently points to 0.0.0.0, which means "all addresses on the local machine". That is ambiguous for clients on other machines, which cannot use 0.0.0.0 to reach the NameNode; a Hadoop cluster should have exactly one NameNode address that clients connect to.
Replacing 0.0.0.0 with the NameNode's IP or hostname should resolve the issue you are facing.
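For example (a sketch; 192.168.x.x is a placeholder for the VM's/NameNode's IP, which must be reachable from the Windows host):
<configuration>
  <property>
    <!-- NameNode address that remote clients can actually connect to (placeholder IP) -->
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.x.x:9000</value>
  </property>
</configuration>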
Answer 2:
Spring connected to YARN after I changed 0.0.0.0:9000 to [VM's IP]:9000 in core-site.xml. Thanks to @RameshMaharjan.
Source: https://stackoverflow.com/questions/44219690/spring-boot-yarn-doesnt-run-on-hadoop-2-8-0-client-cannot-access-datanode