What is the difference between single node & pseudo-distributed mode in Hadoop?

旧时模样 提交于 2019-12-31 08:15:07

问题


I'd like to know what is difference from the configuration point of view as well as theoretical point of view?

Do these two modes use different port numbers? or any other difference?


回答1:


My 2 cents.

Single node setup (standalone setup)

By default, Hadoop is configured to run in a non-distributed or standalone mode, as a single Java process. There are no daemons running and everything runs in a single JVM instance. HDFS is not used.

You don't have to do anything as far as configuration is concerned, except the JAVA_HOME. Just download the tarball, unzip it, and you are good to go.

Pseudo-distributed mode

The Hadoop daemons run on a local machine, thus simulating a cluster on a small scale. Different Hadoop daemons run in different JVM instances, but on a single machine. HDFS is used instead of local FS.

As far as pseudo-distributed setup is concerned, you need to set at least following 2 properties along with JAVA_HOME:

  1. fs.default.name in core-site.xml.

  2. mapred.job.tracker in mapred-site.xml.

You could have multiple datanodes and tasktrackers, but that doesn't make much sense on a single machine.

HTH




回答2:


A single node setup is one where you have (presumably) one datanode and one tasktracker on a single machine.

A pseudo-distributed setup is where you have multiple datanodes and (presumably) tasktrackers on a single machine. So you have multiple instances of a datanode service running on a single machine to emulate a multi-node cluster.



来源:https://stackoverflow.com/questions/23435333/what-is-the-difference-between-single-node-pseudo-distributed-mode-in-hadoop

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!