Question
I am trying to set up an edge node for a cluster at my workplace. The cluster is CDH 5.* Hadoop YARN. It has its own internal private high-speed network. The edge node is outside the private network.
I followed the steps for Hadoop client setup, installing the client with the command below, and configured core-site.xml:
sudo apt-get install hadoop-client
Since the cluster is hosted on its own private network, the IP addresses used inside the internal network are different:
10.100.100.1 - NameNode
10.100.100.2 - Data Node 1
10.100.100.4 - Data Node 2
10.100.100.6 - Data Node 3
To handle this, I asked the cluster admin to add the following properties to hdfs-site.xml on the NameNode so that the listening ports are not open only to the internal IP range:
<property>
<name>dfs.namenode.servicerpc-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.http-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.https-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
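To double-check that these properties actually ended up in the file the NameNode reads, a small script along these lines might help. It is only a sketch: the function name `bind_host_settings` and the `SAMPLE` document are made up for illustration, and it assumes the usual Hadoop layout of `<property>` elements under a `<configuration>` root.

```python
import xml.etree.ElementTree as ET


def bind_host_settings(xml_text):
    """Return {name: value} for every property whose name ends in '-bind-host'."""
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.endswith("-bind-host"):
            settings[name] = value
    return settings


# Hypothetical, shortened stand-in for the real hdfs-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.rpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>"""

print(bind_host_settings(SAMPLE))
```

Against the real file, the text would be read from wherever hdfs-site.xml lives on the NameNode and passed to the same function.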
After this setting was applied and the services were restarted, I am able to run the following command: hadoop fs -ls /user/hduser/testData/XML_Flows/test/test_input/*
This works fine. But when I try to cat the file, I get the following error:
administrator@administrator-Virtual-Machine:/etc/hadoop/conf.empty$ hadoop fs -cat /user/hduser/testData/XML_Flows/test/test_input/*
15/05/04 15:39:02 WARN hdfs.BlockReaderFactory: I/O error constructing remote block reader.
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3035)
at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:744)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:659)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:327)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:574)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:797)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:844)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:78)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:104)
at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:99)
at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
15/05/04 15:39:02 WARN hdfs.DFSClient: Failed to connect to /10.100.100.6:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.100.100.6:50010]
The same error message is repeated multiple times.
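The trace shows the client trying to open a TCP connection directly to 10.100.100.6:50010 and timing out. To check from the edge node whether that address and port are reachable at all, independently of Hadoop, a rough sketch like this could be used. The helper names `can_connect` and `report` are made up for illustration, and only the Python standard library is involved.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unroutable addresses.
        return False


def report(targets, timeout=5.0):
    """Return one human-readable status line per (host, port) pair."""
    lines = []
    for host, port in targets:
        status = "reachable" if can_connect(host, port, timeout) else "NOT reachable"
        lines.append(f"{host}:{port} {status}")
    return lines
```

For example, calling `report([("10.100.100.6", 50010)])` on the edge node would show whether the address from the trace can be reached directly; the other data nodes can be added to the list, assuming they listen on the same port.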
I copied the rest of the XML files (hdfs-site.xml, yarn-site.xml, mapred-site.xml) from the cluster data nodes to be on the safe side, but I still got the same error. Does anyone have any idea what causes this error, or how to make edge nodes work with clusters running on a private network?
The username on the edge node is "administrator", whereas the cluster is configured using the "hduser" ID. Could this be a problem? I have configured passwordless login between the edge node and the name node.
Source: https://stackoverflow.com/questions/30028998/configure-edge-node-to-launch-hadoop-jobs-on-cluster-running-on-a-private-networ