Which hadoop version should I choose among 1.x, 2.2 and 0.23

元气小坏坏 提交于 2019-11-30 10:36:25

There are a few active release series. The 1.x release series is a continuation of the 0.20 release series. A few weeks after 0.23 released, the 0.20 branch formerly known as 0.20.205 was renumbered 1.0. There is next to no functional difference between 0.20.205 and 1.0. This is just a renumbering.

The 0.23 includes several major new features includes a new MapReduce runtime, called MapReduce 2, implemented on a new system called YARN (Yet Another Resource Negotiator), which is a general resource management system for running distributed applications. Similarly, 2.x release is a continuation of the 0.23 release series. So the 2.2 also support YARN.

According to Hadoop 2.2 release note

  • 1.2.X - current stable version, 1.2 release

  • 2.2.X - current stable 2.x version

  • 0.23.X - similar to 2.X.X but missing NN HA.

I would suggest starting with Cloudera distribution since you just start learning. The CDH 4.5 includes the YARN feature you are looking for. You can also try HortonWorks distribution. The advantage of going with these vendors is that you do not need to worry about which version of components such as Hive, Pig to work with your Hadoop installation.

I recommended you to start with hadoop-2.2.0 which gives good knowledge. Industry prefers YARN itself and in production 2.x only exists

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!