hiveql

SMB join not working over Hive Tables

风格不统一 posted on 2019-12-25 07:14:55
Question: While performing an SMB join over two ORC tables, both bucketed and sorted on subscription_id, the join fails with the error below:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:210)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred

Removing special characters using Hive

人盡茶涼 posted on 2019-12-25 03:58:27
Question: I have data stored in Cassandra 1.2 as shown below. There is a special character in sValue (highlighted in bold). How can I use a Hive function to remove it?
Date | Timestam | payload_Timestamp | actDate | actHour | actMinute | sDesc | sName | sValue
---------------------------------+--------------------------------------+--------------------------+----------------------+----------------------+------------------------+---------------------------+--------------------------------+------------

How to find the sum of value based on Adjustments in Impala query

允我心安 posted on 2019-12-25 03:28:18
Question: I have an Impala table named REV that holds wire_code, amount, and reporting line for each wire code.
+---------+------+----------------+
|wire_code| amt  | Reporting_line |
+---------+------+----------------+
| abc     | 100  | Database       |
+---------+------+----------------+
| abc     | 10   | Revenue        |
+---------+------+----------------+
| def     | 50   | Database       |
+---------+------+----------------+
| def     | 25   | Polland        |
+---------+------+----------------+
| ghi     | 250  | Cost           |
+---------+------+---------------

Date variable in Hive

生来就可爱ヽ(ⅴ<●) posted on 2019-12-25 03:13:09
Question: I am using the following code to set a date in Hive:
SET DATE_DM2=date_sub(from_unixtime(unix_timestamp(),'yyyy/MM/dd'), cast(((from_unixtime(unix_timestamp(), 'u') % 7)+1) as int));
But when I run the following select statement, I get no output:
select * from TableName where partitiondate='${DATE_DM2}';
Is there anything wrong with the syntax?
Answer 1: The correct syntax is:
select * from TableName where partitiondate='${hiveconf:DATE_DM2}';
Source: https://stackoverflow.com/questions
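A note on the exchange above: as far as I know, Hive's SET stores the right-hand side as plain text and does not evaluate it, so even with the hiveconf: prefix the quoted variable expands to the expression text, not to a date. One possible workaround, sketched here under assumptions, is to compute the date in the shell and pass it in with --hiveconf. The sketch assumes GNU date (for -d) and a hive CLI on the PATH. last_saturday_before and run_partition_query are hypothetical helper names, the yyyy-MM-dd output format is a guess at how partitiondate is stored, and TableName and partitiondate are taken from the question.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU date (-d) and, for run_partition_query, a hive CLI on PATH.

# Print the date that lies ((day-of-week % 7) + 1) days before the reference date
# (default: today), mirroring the arithmetic in the question.
# day-of-week is 1 (Monday) .. 7 (Sunday), so the result is always a Saturday
# strictly before the reference date.
last_saturday_before() {
  local ref="${1:-$(date +%Y-%m-%d)}"
  local dow days_back
  dow=$(date -d "$ref" +%u)
  days_back=$(( dow % 7 + 1 ))
  date -d "$ref $days_back days ago" +%Y-%m-%d
}

# Hypothetical wrapper: pass the computed date to Hive as a hiveconf variable.
# The backslash keeps ${hiveconf:DATE_DM2} away from the shell so that Hive,
# not bash, performs the substitution.
run_partition_query() {
  local dt
  dt=$(last_saturday_before "$@")
  hive --hiveconf DATE_DM2="$dt" \
       -e "select * from TableName where partitiondate='\${hiveconf:DATE_DM2}';"
}
```

Calling run_partition_query with no argument uses today's date as the reference; passing a date such as 2019-12-25 makes the run reproducible.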

Creating a hive table with ~40K columns

∥☆過路亽.° posted on 2019-12-25 02:44:46
Question: I'm trying to create a fairly large table, ~3 million rows and ~40K columns, using Hive. To begin, I'm creating an empty table and then inserting the data into it. However, I hit an error when trying this:
Unable to acquire IMPLICIT, SHARED lock default after 100 attempts. FAILED: Error in acquiring locks: Locks on the underlying objects cannot be acquire. retry after some time
The query is pretty straightforward:
create external table database.dataset ( var1 decimal(10,2), var2 decimal(10

How to compute the intersections and unions of two arrays in Hive?

社会主义新天地 posted on 2019-12-24 16:52:25
Question: For example, the intersection select intersect(array("A","B"), array("B","C")) should return ["B"], and the union select union(array("A","B"), array("B","C")) should return ["A","B","C"]. What's the best way to do this in Hive? I have checked the Hive documentation but cannot find anything relevant.
Answer 1: The solution to your problem is here. Go to the GitHub link; it has a lot of UDFs created by Klout. Download it, create the JAR, and add the JAR to Hive. Example CREATE

Hive - multiple (average) count distincts over layered groups

穿精又带淫゛_ posted on 2019-12-24 10:45:59
Question: Given the following source data (say the table name is user_activity):
+---------+-----------+------------+
| user_id | user_type | some_date  |
+---------+-----------+------------+
| 1       | a         | 2018-01-01 |
| 1       | a         | 2018-01-02 |
| 2       | a         | 2018-01-01 |
| 3       | a         | 2018-01-01 |
| 4       | b         | 2018-01-01 |
| 4       | b         | 2018-01-02 |
| 5       | b         | 2018-01-02 |
+---------+-----------+------------+
I'd like to get the following result:
+-----------+------------+---------------------+
| user_type | user_count |

convert normal column as partition column in hive

北慕城南 posted on 2019-12-24 10:34:09
Question: I have a table with 3 columns. Now I need to turn one of the columns into a partition column. Is that possible? If not, how can we add a partition to an existing table? I used the syntax below:
create table t1 (eno int, ename string ) row format delimited fields terminated by '\t';
load data local '/....path/' into table t1;
alter table t1 add partition (p1='india');
I am getting errors. Does anyone know how to add a partition to an existing table? Thanks in advance.
Answer 1: I don't

How can I export view data in hive?

巧了我就是萌 posted on 2019-12-24 10:25:15
Question: I have created 4 tables (a, b, c, d) in Hive and created a view (x) on top of those tables by joining them.
-- How can I export the CSV data underlying x from HDFS to local?
-- How can I keep this CSV in HDFS?
For tables, we can run show create table a; this shows the HDFS location where the underlying CSV is stored.
hadoop fs get --from source_path_and_file --to dest_path_and_file
Similarly, how can I get the CSV data from the view onto my local machine?
Answer 1: You can export view data to the CSV

How to check whether a partition exists with hive

拜拜、爱过 posted on 2019-12-24 09:05:57
Question: I have a HiveQL script that performs some operations on a Hive table. Before doing these operations, I want to check whether the required partition exists and, if it does not, terminate the script. How can I achieve this?
Answer 1: Using shell:
table_name="schema.table"
partition_spec="key=value"
partition_exists=$(hive -e "show partitions $table_name" | grep "$partition_spec");
#check partition_exists
if [ "$partition_exists" = "" ]; then echo not exists; else echo exists; fi
Source: https:/
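One caveat with the grep in the answer above: it matches substrings, so key=value would also match a partition named key=value2. A minimal sketch of a stricter check follows, assuming that show partitions prints one partition spec per line and that the full spec is given (for multi-level partitions, something like a=1/b=2). partition_in_list and require_partition are hypothetical helper names, not part of Hive.

```shell
#!/usr/bin/env bash
# Sketch only: assumes `show partitions` prints one partition spec per line.

# Read partition specs on stdin; succeed only if some line equals the spec exactly.
# -F: fixed string (no regex), -x: whole-line match, -q: quiet.
partition_in_list() {
  grep -Fxq -- "$1"
}

# Hypothetical wrapper: terminate the calling script when the partition is missing.
# Requires a hive CLI on PATH.
require_partition() {
  local table_name="$1" partition_spec="$2"
  if hive -e "show partitions $table_name" | partition_in_list "$partition_spec"; then
    echo "exists"
  else
    echo "not exists" >&2
    exit 1
  fi
}
```

In a driver script, require_partition "schema.table" "key=value" would go before the hive call that runs the main HiveQL script, so the script stops early when the partition is absent.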