Hadoop gen1 vs Hadoop gen2

南笙 2021-02-10 09:22

I am a bit confused about the place of the TaskTracker in Hadoop 2.x.

Daemons in Hadoop 1.x are namenode, datanode, jobtracker, tasktracker and secondaryna

9 answers
  •  天命终不由人
    2021-02-10 10:16

    Just remember the following comparisons: JobTracker = ResourceManager (with its scheduler: FIFO, Fair Scheduler, or Capacity Scheduler) + ApplicationMaster (one per job, running in that job's first container, known as container 0).

    TaskTracker = NodeManager

    When a job is submitted in Hadoop 1.x, the JobTracker is responsible for calculating the number of mappers and reducers for the job, monitoring live and dead TaskTrackers, and re-spawning mappers and reducers if they fail.

    In Hadoop 2.x, when we submit a job, the ResourceManager (RM) Java process, which also acts as the scheduler, first spawns an ApplicationMaster (AM) on some node; this first container is known as container 0. The AM then reads the job code, calculates the resources the job requires, and requests them from the scheduler, which also tracks how many resources the job's queue has available. The scheduler replies with the names of the nodes where the AM can spawn containers. The AM then spawns containers on those nodes and monitors them. If a container dies, it is the AM that goes back to the scheduler and negotiates for more resources. Hence the work of the JobTracker is divided between the AM and the YARN scheduler.

    Note also that each submitted job gets a new AM, so multiple AMs can be running at once, but there is only one scheduler per cluster. AMs are spawned on NodeManager nodes, while the scheduler runs on the RM node.
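To make the division of work concrete, here is a rough toy model of the flow described above. It is only a sketch and not the YARN API: the class names `Scheduler` and `ApplicationMaster`, the `submit` helper, the slot-counting logic, and the node names are all made up for illustration. Real YARN allocates by memory and vcores, and containers are actually launched through the NodeManagers.

```python
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    """Toy stand-in for the single cluster-wide scheduler inside the RM."""
    free_slots: dict  # hypothetical model: node name -> free container slots

    def allocate(self, count):
        """Return up to `count` node names, consuming one slot per grant."""
        granted = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < count:
                self.free_slots[node] -= 1
                granted.append(node)
        return granted


@dataclass
class ApplicationMaster:
    """Toy per-job AM; a new instance is created for every submitted job."""
    job: str
    scheduler: Scheduler
    containers: list = field(default_factory=list)

    def run(self, tasks_needed):
        # Ask the scheduler for nodes, then "launch" one container per grant.
        self.containers = self.scheduler.allocate(tasks_needed)

    def on_container_failure(self, node):
        # A container died: drop it and negotiate a replacement.
        # Simplification: the failed slot is not returned to the scheduler.
        self.containers.remove(node)
        self.containers += self.scheduler.allocate(1)


def submit(job, scheduler, tasks_needed):
    """RM side: start a fresh AM in 'container 0', then let it run the job."""
    am_node = scheduler.allocate(1)  # container 0 hosts the AM itself
    if not am_node:
        raise RuntimeError("no capacity to start the ApplicationMaster")
    am = ApplicationMaster(job, scheduler)
    am.run(tasks_needed)
    return am_node[0], am


scheduler = Scheduler({"node1": 2, "node2": 2, "node3": 2})
am_node, am = submit("wordcount", scheduler, tasks_needed=3)
am.on_container_failure("node2")  # the AM, not the RM, asks for a replacement
```

Submitting a second job against the same `scheduler` creates a second `ApplicationMaster` object, which mirrors the point that there can be many AMs but only one scheduler.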
