beeline

Adding Local Files in Beeline (Hive)

China☆狼群 submitted on 2021-02-10 17:42:56
Question: I'm trying to add local files via the Beeline client, but I keep running into an issue where it tells me the file does not exist.

    [test@test-001 tmp]$ touch /tmp/m.py
    [test@test-001 tmp]$ stat /tmp/m.py
      File: ‘/tmp/m.py’
      Size: 0         Blocks: 0          IO Block: 4096   regular empty file
      Device: 801h/2049d   Inode: 34091464   Links: 1
      Access: (0664/-rw-rw-r--)  Uid: ( 1036/ test)   Gid: ( 1037/ test)
      Context: unconfined_u:object_r:user_tmp_t:s0
      Access: 2017-02-27 22:04:06.527970709 +0000
      Modify: 2017-02-27 22
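A common cause of this symptom: beeline is a thin JDBC client, so `add file` with a local path is resolved on the HiveServer2 host, not on the machine where beeline runs. A sketch of the usual workaround, staging the file in HDFS instead (the paths and JDBC URL below are assumptions for illustration):

```shell
# "add file /tmp/m.py" looks for /tmp/m.py on the HiveServer2 host,
# so a file that exists only on the client machine appears "missing".
# Staging it in HDFS makes it visible regardless of which host resolves it.
hdfs dfs -put /tmp/m.py /tmp/m.py
beeline -u jdbc:hive2://localhost:10000/default \
        -e "add file hdfs:///tmp/m.py; list files;"
```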

Optimize Hive Query. java.lang.OutOfMemoryError: Java heap space/GC overhead limit exceeded

我怕爱的太早我们不能终老 submitted on 2021-01-28 14:18:32
Question: How can I optimize a query of this form, since I keep running into this OOM error? Or how can I come up with a better execution plan? If I remove the substring clause, the query works fine, suggesting that this clause takes a lot of memory. When the job fails, the beeline output shows the OOM "Java heap space" error. Advice online suggested that I increase export HADOOP_HEAPSIZE, but this still results in the error. Another thing I tried was increasing hive.tez.container.size and hive.tez.java.opts (tez
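For reference, raising the Tez memory settings is done per-session like this; the values below are purely illustrative (a sketch, since appropriate numbers depend on the cluster's YARN container limits), and the JVM heap is conventionally kept around 80% of the container size:

```sql
-- Illustrative values only; must fit within yarn.scheduler.maximum-allocation-mb.
SET hive.tez.container.size=4096;        -- container memory in MB
SET hive.tez.java.opts=-Xmx3276m;        -- ~80% of the container size
```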

“Beeline command not found” error while executing beeline command from python script (called from oozie shell action)

爷,独闯天下 submitted on 2020-01-25 09:50:05
Question: I have a Python script that I want to schedule using Oozie, and I am using an Oozie shell action to invoke it. There is a beeline command in the script. When I run the Oozie workflow, I get the error "sh: beeline: command not found". If I run the script, or just the beeline command, manually from the edge node, it runs perfectly fine. My data platform is Hortonworks 2.6. Below are my workflow.xml and Python script:

    <workflow-app xmlns="uri:oozie:workflow:0.3" name="hive2-wf">
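The usual explanation is that an Oozie shell action executes on an arbitrary NodeManager under a minimal environment, so PATH entries present on the edge node are missing there. A sketch of a workaround, assuming the Hive client is installed on the worker nodes (the install path is typical for HDP but may differ on a given cluster):

```shell
#!/bin/bash
# Oozie shell actions run on a NodeManager with a stripped-down PATH,
# so extend it explicitly (or call beeline by its absolute path).
export PATH=$PATH:/usr/hdp/current/hive-client/bin
beeline -u "jdbc:hive2://hiveserver:10000/default" -e "show tables;"
```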

What is the Hive command to see the value of hive.exec.dynamic.partition?

▼魔方 西西 submitted on 2019-12-21 12:41:42
Question: We know that the SET command is used to assign values to properties:

    hive> SET hive.exec.dynamic.partition=true;
    hive> SET hive.exec.dynamic.partition.mode=nonstrict;

But how do we read the current value of such a property? I tried the commands below, but they do not work:

    get hive.exec.dynamic.partition
    show hive.exec.dynamic.partition

Could someone help with the correct Hive command to read the current value of these properties?

Answer 1: The same SET command, but without a value assignment
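Concretely, issuing SET with just the property name prints its current value:

```sql
SET hive.exec.dynamic.partition;
-- prints something like: hive.exec.dynamic.partition=true
SET hive.exec.dynamic.partition.mode;
```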

AuthorizationException: User not allowed to impersonate User

≯℡__Kan透↙ submitted on 2019-12-13 11:58:27
Question: I wrote a Spark job which registers a temp table, and I expose it via beeline (JDBC client):

    $ ./bin/beeline
    beeline> !connect jdbc:hive2://IP:10003 -n ram -p xxxx
    0: jdbc:hive2://IP> show tables;
    +------------+--------------+
    | tableName  | isTemporary  |
    +------------+--------------+
    | f238       | true         |
    +------------+--------------
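This error typically means the user the thrift server runs as is not whitelisted as a Hadoop proxy user. A sketch of the relevant core-site.xml entries, assuming the server runs as user `hive` (the wildcard values are permissive placeholders and should be restricted in production):

```xml
<!-- core-site.xml: allow the "hive" service user to impersonate others -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```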

Hive/Beeline, how can I set the job .staging directory?

喜欢而已 submitted on 2019-12-12 02:34:34
Question: On the cluster I'm working on, every user is given a 60GB Hadoop quota. The project I'm working on has historically generated a lot of Hive queries. To make things faster I'm trying to run these (unrelated) queries in parallel, but as a result the directory /user/{myusername}/.staging/ fills up with job_{someid} directories, which in turn are filled with the Hive jars and consume the 60GB very fast. While I can limit the parallelization factor, I would also like to see if I
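One knob worth checking is the MapReduce staging directory, which defaults to a path under the user's home. A sketch of redirecting it (the target path is an assumption; it must be writable and have spare quota, and whether the property can be overridden per-session depends on the cluster's configuration whitelist):

```sql
-- Default is /user/<name>/.staging under yarn.app.mapreduce.am.staging-dir.
-- Pointing it elsewhere moves the job_<id> directories out of the user quota.
SET yarn.app.mapreduce.am.staging-dir=/tmp/hadoop-staging;
```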

Logs for Hive query executed via beeline

浪子不回头ぞ submitted on 2019-12-08 04:28:24
Question: I am running the Hive command below from beeline. Can someone please tell me where I can see the MapReduce logs for it?

    0: jdbc:hive2://<servername>:10003/> select a.offr_id offerID, a.offr_nm offerNm,
           b.disp_strt_ts dispStartDt, b.disp_end_ts dispEndDt,
           vld_strt_ts validStartDt, vld_end_ts validEndDt
         from gcor_offr a, gcor_offr_dur b
         where a.offr_id = b.offr_id and b.disp_end_ts > '2016-09-13 00:00:00';

Answer 1: When using beeline, MapReduce logs are part of the HiveServer2 log4j logs. If your Hive
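Related to the answer above: HiveServer2 can also stream the operation's execution log back into the beeline session itself. A sketch using the operation-logging properties (available in Hive 0.14 and later; they can equally be set in hive-site.xml):

```sql
SET hive.server2.logging.operation.enabled=true;
SET hive.server2.logging.operation.level=EXECUTION;  -- or VERBOSE for more detail
```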

Apache Hive 2.1.1: a detailed installation and configuration walkthrough, covering hive, beeline, hwi, HCatalog, WebHCat, and other components

拥有回忆 submitted on 2019-12-05 23:02:36
After successfully setting up an Apache Hadoop 2.8 distributed cluster in a Docker environment, including NameNode HA and ResourceManager HA (see my other post: Apache Hadoop 2.8 distributed cluster detailed setup process), the next step is to set up the latest stable Apache Hive, 2.1.1. This makes it convenient to test Hive configurations and jobs on my own machine, and the same configuration can also be applied on servers. Below is the detailed installation and configuration process for Apache Hive 2.1.1.

1. Read the official Apache Hive documentation and download the latest version. Hive is a data warehouse tool built on Hadoop: it maps structured data in HDFS to tables and translates SQL-like scripts into MapReduce jobs, so users only need to supply SQL statements, as with a traditional relational database, to analyze and process data on Hadoop. The barrier to entry is low, making it well suited for moving from relational-database-based analysis to Hadoop-based analysis; Hive is therefore a very important tool in the Hadoop ecosystem. The most direct way to install and configure Apache Hive is to read the official Apache Hive documentation, which contains a great deal of useful information. Apache Hive requires JDK 1.7 or later and Hadoop 2.x (Hadoop 1.x is no longer supported as of Hive 2.0.0), and Hive can be deployed on Linux, Mac, and Windows. From the official site, download the latest stable version of
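The download-and-setup steps the post describes can be sketched as follows (the mirror URL and install path are assumptions; check the Apache download page for a current link, and Derby is used here only as a test metastore):

```shell
# Fetch and unpack the Hive 2.1.1 binary release.
wget https://archive.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz
tar -xzf apache-hive-2.1.1-bin.tar.gz -C /opt

# Point the environment at the install (typically added to ~/.bashrc).
export HIVE_HOME=/opt/apache-hive-2.1.1-bin
export PATH=$PATH:$HIVE_HOME/bin

# Initialize the metastore schema (embedded Derby, fine for local testing).
schematool -dbType derby -initSchema
```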

How to access Metastore from beeline?

左心房为你撑大大i submitted on 2019-12-02 11:38:58
Question: I need to run some SQL queries (as here) directly against the Metastore. PS: the SHOW/DESCRIBE commands are not enough. How do I enable access to it as a database, and what is the database name of the Metastore? ... Is this possible nowadays (2019)? NOTES: What is the Metastore? To me it is a very important element of the Hive architecture, and end users need some access to it... "All Hive implementations need a metastore service, where it stores metadata. It is implemented using tables in a relational database. By
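The metastore is an ordinary relational database (Derby, MySQL, PostgreSQL, ...), so such queries go to that database directly, not through beeline; its location and name come from javax.jdo.option.ConnectionURL in hive-site.xml. A sketch against a MySQL-backed metastore (the database name `metastore` is an assumption; TBLS and DBS are real metastore tables):

```sql
-- Run against the metastore RDBMS itself, e.g. mysql -u hive metastore
SELECT d.NAME AS db_name, t.TBL_NAME, t.TBL_TYPE
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID
ORDER BY d.NAME, t.TBL_NAME;
```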

Connecting to Hive using Beeline

旧时模样 submitted on 2019-11-30 05:11:39
Question: I am trying to connect to Hive, installed on my machine, through the Beeline client. When I give the 'beeline' command and connect to Hive, the client asks for a user name and password:

    !connect jdbc:hive2://localhost:10000/default

I have no idea what user name and password I am supposed to give. Do I have to add the credentials (user name and password) in some configuration file?

Answer (Sravan K Reddy): No username and no password:

    !connect jdbc:hive2://localhost:10000/default
    Enter username for jdbc:hive2://localhost:10000/default: <press Enter>
    Enter password for jdbc:hive2://localhost:10000/default:
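The same connection can also be made non-interactively from the command line; a sketch, assuming a default, unsecured HiveServer2 on port 10000, where any username is accepted and the password may be empty:

```shell
# -u gives the JDBC URL, -n the username, -p the password (empty here).
beeline -u jdbc:hive2://localhost:10000/default -n "$USER" -p ""
```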