Linux中hive的安装部署

匿名 (未验证) 提交于 2019-12-02 21:56:30


hive简介

Hive是一个数据仓库基础工具在Hadoop中用来处理结构化数据。它架构在Hadoop之上,总归为大数据,并使得查询和分析方便。并提供简单的sql查询功能,可以将sql语句转换为MapReduce任务进行运行。

Hive 不是:

  1. 一个关系数据库
  2. 一个设计用于联机事务处理(OLTP)
  3. 实时查询和行级更新的语言

Hive特点:

  1. 建立在Hadoop之上
  2. 处理结构化的数据
  3. 存储依赖于HDSF:hive表中的数据是存储在hdfs之上
  4. SQL语句的执行依赖于MapReduce
  5. hive的功能:让Hadoop实现了SQL的接口,实际就是将SQL语句转化为MapReduce程序
  6. hive的本质就是Hadoop的客户端
  7. hive支持的计算框架有MapReduce、Spark、Tez

hive官网地址

hive官网下载地址:https://mirrors.tuna.tsinghua.edu.cn/apache/hive/

hive配置:https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration

hive官网:https://cwiki.apache.org/confluence/display/Hive/Home#Home-ResourcesforContributors

hive安装配置

配置hive之前先要保证已经安装部署好了Java、Hadoop、MySQL
原因:hive基于Java开发,hive是映射Hadoop中hdfs的文件、引擎位Hadoop的MapReduce,Remote Metastore Server改为MySQL

Centos7的Java、Hadoop、MySQL安装部署前面已经写了,这里就不一一赘述了

CentOS 7 下使用yum安装MySQL5.7.20,并设置开启启动

hadoop在Linux中的安装步骤及开发环境搭建

下载安装包后在Linux上解压安装
先在官网下载hive(我这里下载是hive3.1.1版本)
下载到windows本地后在Linux中通过rz命令将安装包放入Linux

解压安装:tar apache-hive-3.1.1-bin.tar.gz -C /opt/software/

修改文件夹名称:mv apache-hive-3.1.1-bin hive-3.1.1

修改opt/modules/hive-1.2.1/conf/目录下的配置文件
修改hive-env.sh文件(配置Hadoop环境变量和hive配置文件目录)
cd opt/modules/hive-1.2.1/conf/
mv hive-env.sh.template hive-env.sh
vi hive-env.sh

HADOOP_HOME=/opt/software/hadoop-2.6.0-cdh5.7.6 export HIVE_CONF_DIR=/opt/software/hive-3.1.1/conf 

修改hive-log4j2.properties文件(修改日志目录和文件名)
mv hive-log4j2.properties.template hive-log4j2.properties
vi hive-log4j2.properties

property.hive.log.dir = /opt/software/hive-3.1.1/logs property.hive.log.file = hive.log 


配置Hive的Metastore
cd /opt/software/hive-3.1.1/conf
cp hive-default.xml.template hive-site.xml
清空文件内容::>hive-site.xml
vi hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--    this work for additional information regarding copyright ownership.    The ASF licenses this file to You under the Apache License, Version 2.0    (the "License"); you may not use this file except in compliance with    the License.  You may obtain a copy of the License at     Unless required by applicable law or agreed to in writing, software    distributed under the License is distributed on an "AS IS" BASIS,    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.    See the License for the specific language governing permissions and    limitations under the License. --><configuration>   <!-- WARNING!!! This file is auto generated for documentation purposes ONLY! -->   <!-- WARNING!!! Any changes you make to this file will be ignored by Hive.   -->   <!-- WARNING!!! You must make your changes in hive-site.xml instead.         -->   <!-- Hive Execution Parameters -->    <property>     <name>hive.exec.dynamic.partition</name>     <value>false</value>     <description>Whether or not to allow dynamic partitions in DML/DDL.</description>   </property>   <property>     <name>hive.exec.dynamic.partition.mode</name>     <value>strict</value>     <description>How many jobs at most can be executed in parallel</description>   </property>   <property>     <name>hive.metastore.db.type</name>     <value>mysql</value>     <description>       Expects one of [derby, oracle, mysql, mssql, postgres].       Type of database used by the metastore. Information schema &amp; JDBCStorageHandler depend on it.     </description>   </property>   <property> 	<name>javax.jdo.option.ConnectionURL</name> 	<value>jdbc:mysql://bigdata.fuyun:3306/hive_3__metastore?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value> 	<description>JDBC connect string for a JDBC metastore</description>   </property>   <property>   	<name>javax.jdo.option.ConnectionDriverName</name>   	<value>com.mysql.jdbc.Driver</value>   	<description>Driver class name for a JDBC metastore</description>   </property>   <property>   	<name>javax.jdo.option.ConnectionUserName</name>   	<value>root</value>   </property>   <property>   	<name>javax.jdo.option.ConnectionPassword</name>   	<value>Mysql@123</value>   </property>   <property>        <name>hive.server2.thrift.port</name>        <value>10000</value>    </property>   <property>        <name>hive.server2.thrift.bind.host</name>        <value>bigdata.fuyun</value>   </property>   <property> 	<name>hive.metastore.uris</name> 	<value>thrift://bigdata.fuyun:9083</value>   </property>   <property> 	<name>hive.metastore.local</name> 	<value>false</value>   </property>   <property> 	<name>hive.metastore.warehouse.dir</name> 	<value>/user/hive/warehouse-3.1.1</value>   </property>   <property> 	<name>datanucleus.schema.autoCreateAll</name> 	<value>true</value>   </property>   <property> 	<name>datanucleus.metadata.validate</name> 	<value>false</value>   </property>   <property> 	<name>hive.metastore.schema.verification</name> 	<value>false</value>   </property>   <property>          <name>hive.execution.engine</name>          <value>mr</value>   </property>   <property>        <name>hive.cli.print.current.db</name>        <value>true</value>        <description>Whether to include the current database in the Hive prompt.</description>   </property>   <property>        <name>hive.cli.print.header</name>        <value>true</value>       <description>Whether to print the names of the columns in query output.</description>   </property> </configuration>  

将MySQL的JDBC驱动包放到hive的lib目录下
我的MySQL是5.7版本的,所以要下载8.0及以上的驱动
rz

cp mysql-connector-java-8.0.13.tar.gz ../software/hive-3.1.1/lib/

设置为全局变量(如果不想设置为全部变量此步骤可以忽略)
sudo vi /etc/profile

#HIVE_HOME export HIVE_HOME=/opt/software/hive-3.1.1 export PATH=$PATH:$HIVE_HOME/bin export CLASSPATH=$CLASSPATH:/opt/software/hadoop-2.6.0-cdh5.7.6/lib/*:. export CLASSPATH=$CLASSPATH:/opt/software/hive-3.1.1/lib/*:. 


刷新环境变量:source /etc/profile

输入命令测试

hive服务启动脚本

cat start-hiveserver2.sh

#!/bin/sh   HIVE_HOME=/opt/software/hive-3.1.1  ## 启动服务的时间 DATE_STR=`/bin/date '+%Y%m%d%H%M%S'`  # 日志文件名称(包含存储路径) HIVE_SERVER2_LOG=${HIVE_HOME}/logs/hiveserver2-${DATE_STR}.log  ## 启动服务 /usr/bin/nohup ${HIVE_HOME}/bin/hiveserver2 > ${HIVE_SERVER2_LOG} 2>&1 & 

cat start-metastore.sh

#!/bin/sh   HIVE_HOME=/opt/software/hive-3.1.1  ## 启动服务的时间 DATE_STR=`/bin/date '+%Y%m%d%H%M%S'`  # 日志文件名称(包含存储路径) HIVE_SERVER2_LOG=${HIVE_HOME}/logs/hivemetastore-${DATE_STR}.log  ## 启动服务 /usr/bin/nohup ${HIVE_HOME}/bin/hive --service metastore > ${HIVE_SERVER2_LOG} 2>&1 & 

后记:中间遇到了很多坑,差点弄了个通宵,也总算解决了,具体问题主要是metestore配置和JDBC的jar包问题。具体怎么解决的,忘了,想起来后再补补记录。

文章来源: https://blog.csdn.net/lz6363/article/details/92269137
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!