Hadoop cluster is using only master node or all nodes

Submitted by 时光怂恿深爱的人放手 on 2019-12-13 06:53:38

Question


I have created a 4-node Hadoop cluster. I start all the daemons: the DataNodes, the NameNode, the ResourceManager, etc.

To find whether all of my nodes are working or not, I tried the following procedure:

Step 1. I ran my program when all nodes were active.
Step 2. I ran my program when only the master was active.

The completion time was almost the same in both cases.

So, I would like to know if there is any other means by which I can know how many nodes are actually used while running the program.


Answer 1:


This was discussed in chat. The problem was caused by an incorrect Hadoop installation: in both cases the job was started locally using LocalJobRunner, so the other nodes were never used.
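A quick way to check for this condition, as a rough sketch only: in Hadoop 2 and later the job runner is selected by the `mapreduce.framework.name` property, whose default value is `local` (which means LocalJobRunner); it has to be `yarn` for jobs to be submitted to the cluster. The Python helper below reads that property from the text of a `mapred-site.xml` file. The function names are hypothetical, and the file on disk is only an approximation of what the job actually sees, since the client may load a different configuration directory or override the value in code.

```python
import xml.etree.ElementTree as ET

PROPERTY = "mapreduce.framework.name"

def framework_name(mapred_site_xml: str) -> str:
    """Return the configured MapReduce framework, or 'local' if unset."""
    root = ET.fromstring(mapred_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == PROPERTY:
            return prop.findtext("value", "").strip()
    # Hadoop falls back to 'local' when the property is absent.
    return "local"

def runs_locally(mapred_site_xml: str) -> bool:
    """True if jobs using this configuration would use LocalJobRunner."""
    return framework_name(mapred_site_xml) == "local"
```

Pass it the contents of the `mapred-site.xml` that the submitting client uses. If `runs_locally` returns `True`, the job runs in a single JVM on the submitting machine regardless of how many nodes are up.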

Recommendations:

  1. Install Hadoop using Ambari (http://ambari.apache.org/)
  2. Change platform to CentOS 6.4+
  3. Use Oracle JDK 7
  4. Be careful with host names and firewall settings
  5. Get familiar with the cluster commands for health diagnostics and the default Hadoop web UIs
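As an illustration of the last recommendation, here is a minimal sketch that asks the YARN ResourceManager which NodeManagers it considers alive. It uses the ResourceManager REST endpoint `/ws/v1/cluster/nodes`, served on the same port as the web UI (8088 by default). The host name `master` and the helper names are assumptions for this example, not something taken from the original setup.

```python
import json
from urllib.request import Request, urlopen

def running_nodes(payload: dict) -> list:
    """Return the host names of NodeManagers reported as RUNNING."""
    # "nodes" can be null when no NodeManager has registered.
    nodes = (payload.get("nodes") or {}).get("node", [])
    return [
        n.get("nodeHostName", n.get("id"))
        for n in nodes
        if n.get("state") == "RUNNING"
    ]

def fetch_nodes(rm_url: str = "http://master:8088") -> dict:
    """Fetch the cluster node report from the ResourceManager as JSON."""
    req = Request(
        f"{rm_url}/ws/v1/cluster/nodes",
        headers={"Accept": "application/json"},
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Calling `running_nodes(fetch_nodes())` on a healthy 4-node cluster should list every worker host. An empty list, or a job that never shows up in the ResourceManager UI at all, suggests the job is not being submitted to YARN.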


Source: https://stackoverflow.com/questions/27028288/hadoop-cluster-is-using-only-master-node-or-all-nodes
