In a Hadoop cluster, should Hive be installed on all nodes?

こ雲淡風輕ζ submitted on 2019-12-03 05:31:49

No, it is not something you install on worker nodes. Hive is a Hadoop client. Just run Hive according to the instructions on the Hive site.

Hive is used for processing structured and semi-structured data in Hadoop. It can analyze large datasets stored in HDFS as well as in the Amazon S3 filesystem. To query data, Hive provides a query language called HiveQL, which is similar to SQL. With Hive you can easily run ad-hoc queries for data analysis without writing complex MapReduce jobs: you just submit SQL queries, and Hive converts them into MapReduce jobs.

Since Hive SQL is ultimately converted to MapReduce jobs, and we don't have to submit a MapReduce job from every node in a Hadoop cluster, we likewise don't need Hive installed on every node of the cluster.
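To make the "client only" point concrete, here is a minimal sketch of submitting a HiveQL query from a single client machine. It assumes the `hive` command-line tool is on that machine's PATH and uses its `-e` option to run one query string; the table name `web_logs` and the helper function names are hypothetical, made up for illustration. Newer deployments often use Beeline against HiveServer2 instead, but the idea is the same: the query is submitted from one place and the cluster runs the resulting jobs.

```python
import shlex
import subprocess


def build_hive_command(query, hive_bin="hive"):
    """Return the argument list that runs one HiveQL statement via the Hive CLI."""
    # `hive -e "<query>"` executes the given query string and exits.
    return [hive_bin, "-e", query]


def run_query(query, hive_bin="hive"):
    """Run the query on this client machine and return whatever Hive prints."""
    # Assumption: Hive is installed here (the client), not on the worker nodes.
    result = subprocess.run(
        build_hive_command(query, hive_bin),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical table; replace with one that exists in your metastore.
    sql = "SELECT page, COUNT(*) AS hits FROM web_logs GROUP BY page"
    # Show the command that would be run, without needing a live cluster.
    print(shlex.join(build_hive_command(sql)))
```

Calling `run_query(sql)` on the client is all that is needed; the worker nodes only see the jobs that Hive generates from the query.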

From Cloudera's Hive installation Guide:

Install Hive on your client machine(s) from which you submit jobs; you do not need to install it on the nodes in your Hadoop cluster.
