hiveql

Hive: More clean way to SELECT AS and GROUP BY

南笙酒味 submitted on 2020-05-25 06:46:25
Question: I am trying to write a Hive SQL query like this:
    SELECT count(1), substr(date, 1, 4) as year FROM *** GROUP BY year
But Hive does not recognize the alias 'year'; it complains:
    FAILED: SemanticException [Error 10004]: Line 1:79 Invalid table alias or column reference 'year'
One solution (Hive: SELECT AS and GROUP BY) suggests using 'GROUP BY substr(date, 1, 4)'. It works! However, in some cases the value I want to group by is generated by multiple lines of Hive function code, and it is very ugly to
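One common way around the alias limitation described above is to compute the expression once in a subquery and group by the resulting column in the outer query. The following is only a sketch: the table name `events` is hypothetical (the question elides it as `***`), and `date` is assumed to be a string column such as '2020-05-25'.

```sql
-- Sketch only: `events` is a hypothetical table name.
-- The grouping expression is written once, in the inner query.
SELECT t.year,
       count(1) AS cnt
FROM (
    -- `date` is backquoted because it is a reserved word in newer Hive versions.
    SELECT substr(`date`, 1, 4) AS year
    FROM events
) t
GROUP BY t.year;
```

With this shape, a long multi-line expression lives only in the inner SELECT, and the outer GROUP BY refers to it by name.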

Assign same value when using lag function if column used in lag has same value

岁酱吖の submitted on 2020-05-24 03:57:27
Question: I have a table in SQL; its contents are below.
+---+----------+----------+----------+--------+
| pk|    from_d|      to_d| load_date| row_num|
+---+----------+----------+----------+--------+
|111|2019-03-03|2019-03-03|2019-03-03|       1|
|111|2019-02-02|2019-02-02|2019-02-02|       2|
|111|2019-02-02|2019-02-02|2019-02-02|       2|
|111|2019-01-01|2019-01-01|2019-01-01|       3|
|222|2019-03-03|2019-03-03|2019-03-03|       1|
|222|2019-01-01|2019-01-01|2019-01-01|       2|
|333|2019-02-02|2019-02-02|2019-02-02|       1|
|333|2019-01-01|2019-01-01

Hive String to Timestamp conversion with Milliseconds

拥有回忆 submitted on 2020-05-13 14:35:10
Question: I need to convert an input string in the format below into a timestamp, producing the desired output shown.
Input: 16AUG2001:23:46:32.876086
Desired output: 2001-08-16 23:46:32.876086
Output produced by the code below: 2001-08-17 00:01:08
Query:
    select '16AUG2001:23:46:32.876086' as row_ins_timestamp, from_unixtime(unix_timestamp('16AUG2001:23:46:32.876086', 'ddMMMyyyy:HH:mm:ss.SSSSSS')) as row_ins_timestamp from temp;
Milliseconds part is not getting
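The shifted result in the question is consistent with the six fractional digits being read as a millisecond count (876086 ms is roughly 14 minutes 36 seconds, which added to 23:46:32 gives 00:01:08 on the next day). One possible workaround, sketched below, is to parse only the whole-second part and then append the fractional digits as text before casting. The fixed `substr` offsets assume the input always has exactly the layout shown in the question, and the inline subquery `src` is a stand-in for the asker's table.

```sql
-- Sketch only: assumes every input looks like 'ddMMMyyyy:HH:mm:ss.ffffff'.
SELECT cast(
         concat(
           -- Characters 1-18 hold '16AUG2001:23:46:32' (no fraction).
           from_unixtime(unix_timestamp(substr(s, 1, 18), 'ddMMMyyyy:HH:mm:ss')),
           '.',
           -- Characters from position 20 onward hold the fractional digits.
           substr(s, 20)
         ) AS timestamp
       ) AS row_ins_timestamp
FROM (SELECT '16AUG2001:23:46:32.876086' AS s) src;
```

Because `unix_timestamp` and `from_unixtime` both work at whole-second resolution, the fraction never passes through them and so cannot be reinterpreted.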

To schedule a hive query on Crontab

为君一笑 submitted on 2020-04-30 08:40:09
Question: Can anyone help me schedule a job in crontab that executes a simple Hive query at a specific time and writes the output to a text/log file? I have created a batch script to execute a select query, but I get an error ("Hive command not found") when it runs from crontab. The same script runs fine from the shell. Below is my script:
ip.sh
    #!/bin/bash
    echo "Starting of Job"
    cd /home/hadoop/work/hive/bin
    hive -e 'select * from mytest.empl'
    echo "Script ends here"
Crontab: 10

Hive - create hive table from specific data of three csv files in hdfs

a 夏天 submitted on 2020-04-18 05:48:27
Question: I have three .csv files, each in a different HDFS directory. I now want to create a Hive internal table with data from those three files. I want four columns from the first file, three columns from the second file, and two columns from the third file. The first file shares a unique id column with the second file, and the second file shares another unique id column with the third file. Both unique ids are present in the second file; using these ids, I would like to left-outer-join the files to build the table. file 1: '/directory_1/sub_directory_1
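One way to approach the setup described above is to expose each HDFS directory as an external staging table and then build the internal (managed) table with CREATE TABLE AS SELECT over two left outer joins anchored on the second file. Everything named below is hypothetical: the table names, column names, types, and the `/hypothetical/...` locations are placeholders, since the question's real paths and schemas are not shown in full.

```sql
-- Sketch only: all names and paths are placeholders, not the asker's real ones.
-- LOCATION must point at the directory that contains the CSV file.
CREATE EXTERNAL TABLE file1_raw (id_a STRING, a1 STRING, a2 STRING, a3 STRING, a4 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/hypothetical/dir_1';

CREATE EXTERNAL TABLE file2_raw (id_a STRING, id_b STRING, b1 STRING, b2 STRING, b3 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/hypothetical/dir_2';

CREATE EXTERNAL TABLE file3_raw (id_b STRING, c1 STRING, c2 STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/hypothetical/dir_3';

-- CTAS produces a managed (internal) table holding only the chosen columns.
CREATE TABLE combined AS
SELECT f1.a1, f1.a2, f1.a3, f1.a4,   -- four columns from the first file
       f2.b1, f2.b2, f2.b3,          -- three columns from the second file
       f3.c1, f3.c2                  -- two columns from the third file
FROM file2_raw f2
LEFT OUTER JOIN file1_raw f1 ON f2.id_a = f1.id_a
LEFT OUTER JOIN file3_raw f3 ON f2.id_b = f3.id_b;
```

The second file is used as the left side here because it is the only one that carries both ids; if the CSV files have header rows or quoted fields, the staging table definitions would need to account for that.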

Looking to set a reusable variable in hive

倖福魔咒の submitted on 2020-04-18 05:33:52
Question: I'm looking to set a variable like the one below, called today_date, and then reuse it throughout the query. The code below throws an error.
    set today_date = date_format(date_sub(current_date, 1), 'YYYYMMdd')
    select account from table where data_date = today_date
Answer 1: The first command should end with a semicolon:
    set today_date=date_format(date_sub(current_date, 1), 'YYYYMMdd');
And the variable should be used like this:
    select account from table where data_date=${hivevar:today_date};
set
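A compact version of the pattern in the answer above might look like the sketch below. It rests on a few assumptions: the variable is set in the `hivevar` namespace so that the `${hivevar:...}` reference resolves (a plain `set name=value` stores the value as a hiveconf variable instead), the table name `accounts` is a hypothetical replacement for the reserved word `table`, and the year pattern is written as `yyyy` because `YYYY` means week-based year in Java date patterns and can give the wrong year near January 1.

```sql
-- Sketch only: `accounts` is a hypothetical table name.
-- The value is stored as text; Hive does not evaluate it at SET time.
SET hivevar:today_date=date_format(date_sub(current_date, 1), 'yyyyMMdd');

-- The ${hivevar:...} reference is replaced textually before the query is
-- compiled, so the date expression is evaluated here, inside the WHERE clause.
SELECT account
FROM accounts
WHERE data_date = ${hivevar:today_date};
```

Since the substitution is purely textual, the same variable can be reused in several statements of one script, and each use pastes in the full expression.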

Finding the number of users associated with the hive database [closed]

99封情书 submitted on 2020-04-07 10:31:10
Question: Closed. This question needs details or clarity. It is not currently accepting answers. Want to improve this question? Add details and clarify the problem by editing this post. Closed 3 days ago. Please let me know the method for finding the number of users assigned to the databases in Hive.
Source: https://stackoverflow.com/questions/60998388/finding-the-number-of-users-associated-with-the-hive-database
