hiveql

hive using serdeproperties gives error

隐身守侯 submitted on 2020-01-06 06:07:32
Question: I am trying to create a Hive table so that the files on HDFS are UTF-8 encoded. The query gives an error, and I am not sure what I am doing wrong. DROP TABLE IF EXISTS output_2057565014; CREATE TABLE temp.output_2057565014 ROW FORMAT DELIMITED FIELDS TERMINATED BY 'ธ' COLLECTION ITEMS TERMINATED BY '|' MAP KEYS TERMINATED BY '$' with serdeproperties('serialization.encoding'='UTF-8') LOCATION '/tmp/test-2057565014' AS SELECT * from temp.abc Answer 1: "the query is giving error" > yeah,

Query metadata from HIVE using MySQL as metastore

我与影子孤独终老i submitted on 2020-01-05 07:37:12
Question: I am looking for a way to query the metadata of my Hive data with a HiveQL command. I configured a MySQL metastore, but I need to query the metadata through a Hive command because I want to access the data over an ODBC connection to the Hive system. Thanks in advance. Answer 1: You can currently do this using the Hive JDBC StorageHandler: https://github.com/qubole/Hive-JDBC-Storage-Handler Example of table creation from their page: DROP TABLE HiveTable; CREATE EXTERNAL TABLE HiveTable( id INT, id

How to create a unix script to loop a Hive SELECT query by taking table names as input from a file?

与世无争的帅哥 submitted on 2020-01-05 04:07:07
Question: What I'm trying to do is pretty straightforward. I just need to count the records in multiple Hive tables. I want to create a very simple HQL script that takes a file.txt of table names as input and counts the total number of records in each table: SELECT COUNT(*) from <tablename> The output should look like: table1 count1 table2 count2 table3 count3 I'm new to Hive and not well versed in Unix scripting, and I can't figure out how to create a script to do this. Can someone

Hive Tez reducers are running super slow

风格不统一 submitted on 2020-01-04 02:29:13
Question: I have joined multiple tables, and the total number of rows is around 25 billion. On top of that, I am doing an aggregation. Below are the Hive settings I am using to generate the final output. I am not sure how to tune the query to make it run faster. Currently I am relying on trial and error to see whether it produces results, but that doesn't seem to be working. The mappers run fast, but the reducers take forever to finish. Could anyone share your thoughts on this

Translate code string into desc in hive

六眼飞鱼酱① submitted on 2020-01-04 01:48:06
Question: We have a hyphenated string like 0-1-3 .... whose length is not fixed, and a DETAIL table in Hive that explains the meaning of each code. DETAIL | code | desc | + ---- + ---- + | 0 | AAA | | 1 | BBB | | 2 | CCC | | 3 | DDD | Now we need a Hive query to convert the code string into a description string. For example, 0-1-3 should become a string like AAA-BBB-DDD . Any advice on how to do that? Answer 1: Split your string to get an array, explode the array, and join with the detail table
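
The split, explode, and join approach from the first answer could look roughly like the sketch below. It assumes a hypothetical source table `codes` with a string column `code_str`; only the DETAIL table and its `code` and `desc` columns come from the question. `desc` is a reserved word in Hive, so it is quoted with backticks. The inner `DISTRIBUTE BY` / `SORT BY` is a commonly used way to keep the original code order, but `collect_list` does not formally guarantee ordering, so treat that part as an assumption to verify on your Hive version.

```sql
-- Hypothetical source table: codes(code_str STRING), e.g. '0-1-3'.
-- detail(code, desc) is the lookup table from the question.
SELECT t.code_str,
       concat_ws('-', collect_list(t.description)) AS desc_str
FROM (
    SELECT s.code_str, s.pos, d.`desc` AS description
    FROM (
        -- posexplode yields one row per code together with its position
        SELECT c.code_str, e.pos, e.code
        FROM codes c
        LATERAL VIEW posexplode(split(c.code_str, '-')) e AS pos, code
    ) s
    JOIN detail d
      -- the cast keeps the comparison string-to-string whether code is INT or STRING
      ON CAST(d.code AS STRING) = s.code
    DISTRIBUTE BY code_str
    SORT BY code_str, pos
) t
GROUP BY t.code_str;
```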

CASE statements in Hive

点点圈 submitted on 2020-01-03 17:31:47
Question: OK, I have the following code to mark the records that have the highest month_cd in the table with a binary flag: Select t1.month_cd, t2.max_month_cd ,CASE WHEN t2.max_month_cd != null then 0 else 1 end test_1 ,CASE WHEN t2.max_month_cd = null then 0 else 1 end test_2 from source t1 Left join ( Select MAX(month_cd) as max_month_cd From source ) t2 on t1.month_cd = t2.max_month_cd; It seems straightforward to me, but the result it returns is: month_cd max_month_cd test_1 test_2 201610 null 1 1 201611 201611 1 1
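
The result in the question is consistent with standard SQL NULL semantics: comparing anything to NULL with `=` or `!=` evaluates to NULL rather than true, so neither WHEN branch fires and both columns fall through to `ELSE 1`. A minimal sketch of the same query using `IS NOT NULL` follows; the output column name `is_max_month` is an illustrative choice, not from the original post.

```sql
-- Same tables and join as in the question; only the NULL test changes.
SELECT t1.month_cd,
       t2.max_month_cd,
       -- IS NOT NULL is true only for rows that matched the max month
       CASE WHEN t2.max_month_cd IS NOT NULL THEN 1 ELSE 0 END AS is_max_month
FROM source t1
LEFT JOIN (
    SELECT MAX(month_cd) AS max_month_cd
    FROM source
) t2
  ON t1.month_cd = t2.max_month_cd;
```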

HIVE Query for Array Sum

房东的猫 submitted on 2020-01-03 06:04:55
Question: I have the query below. Select split(Salary, '\|') as salaryEmp from tableA It works fine and gives me an array of strings such as ["1089","1078"] . I want to add up the values in this array, but I am not able to cast them to integers and sum them. Can someone suggest a suitable way to do this? Answer 1: select sum(e.col) as sum_Salary from salaryEmp lateral view explode (split(Salary,'\\|')) e +------------+ | sum_salary | +------------+ | 2167 | +------------+ Answer 2: Use explode() + lateral view

Date Difference less than 15 minutes in Hive

左心房为你撑大大i submitted on 2020-01-03 03:10:13
Question: Below is my query. In its last line I am trying to check whether the difference between the dates is within 15 minutes. But whenever I run the query below. SELECT TT.BUYER_ID , COUNT(*) FROM (SELECT testingtable1.buyer_id, testingtable1.item_id, testingtable1.created_time from (select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as timestamps from testingtable2 LATERAL VIEW explode(purchased_item) exploded_table as prod_and_ts where to_date(from_unixtime(cast(prod
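
The query in this snippet is cut off, so the following is only a generic sketch of a "within 15 minutes" test, not a repair of the original. It assumes a hypothetical table `events` with two time columns, `created_time` and `purchase_time`, stored as timestamps or as 'yyyy-MM-dd HH:mm:ss' strings. `unix_timestamp` converts each value to seconds since the epoch, so the difference can be compared against 15 * 60 seconds.

```sql
-- Hypothetical table: events(buyer_id, created_time, purchase_time).
SELECT buyer_id,
       created_time,
       purchase_time
FROM events
-- abs() makes the check symmetric: either time may come first
WHERE abs(unix_timestamp(created_time) - unix_timestamp(purchase_time)) <= 15 * 60;
```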

load struct or any other complex data type in hive

蓝咒 submitted on 2020-01-03 02:56:07
Question: I have a .xlsx file that contains data something like the image below, and I am trying to create a table using the create query below: CREATE TABLE aus_aboriginal( code int, area_name string, male_0_4 STRUCT<num:double, total:double, perc:double>, male_5_9 STRUCT<num:double, total:double, perc:double>, male_10_14 STRUCT<num:double, total:double, perc:double>, male_15_19 STRUCT<num:double, total:double, perc:double>, male_20_24 STRUCT<num:double, total:double, perc:double>, male_25_29 STRUCT<num:double,

How to subtract months from date in HIVE

不羁岁月 submitted on 2020-01-01 06:36:30
Question: I am looking for a method to subtract months from a date in Hive. I have the date 2015-02-01 . I need to subtract 2 months from it so that the result is 2014-12-01 . Can you help me out here? Answer 1: select add_months('2015-02-01',-2); If you need to go back to the first day of the resulting month: select add_months(trunc('2015-02-01','MM'),-2); Answer 2: Try the add_months date function and pass -2 as the number of months. Internally, add_months uses the Java Calendar.add method, which
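
As a small sketch of how `add_months` might be applied to a column rather than a literal, the query below assumes a hypothetical table `orders` with a date (or 'yyyy-MM-dd' string) column `order_date`. Note that when the start date is the last day of its month, or the target month is shorter, Hive returns the last day of the resulting month, so check whether that is what you want for end-of-month data.

```sql
-- Hypothetical table: orders(order_id, order_date).
SELECT order_id,
       order_date,
       add_months(order_date, -2)              AS two_months_earlier,
       -- trunc(..., 'MM') first snaps to the first day of the month
       add_months(trunc(order_date, 'MM'), -2) AS first_day_two_months_earlier
FROM orders;
```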