hiveql

Month in MM using Month() in Hive

我怕爱的太早我们不能终老 submitted on 2020-01-13 14:57:34
Question: Select * from concat(YEAR(DATE_SUB(MAX(Column_name),60),MONTH(DATE_SUB(MAX(Column_name),60),-01) The month() function yields only a single digit for months up to September, i.e. January returns 1 instead of 01. I need help handling this. I am using this output to feed another SELECT query that uses TO_DATE.
Answer 1: The month() function returns an integer, which is why there is no leading zero. You can use lpad(month,2,0) to format the month:
hive> select lpad(month('2017-09-01'),2,0);
OK
09
Time taken: 0.124
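The padding approach from the answer can be applied to the date arithmetic in the question roughly as follows. This is a minimal sketch, not the asker's actual query: the table name my_table and the column name event_date are hypothetical stand-ins for the unnamed table and Column_name.

```sql
-- Pad the month to two digits; the fill character is passed as a string.
SELECT lpad(month('2017-09-01'), 2, '0');

-- Build a 'yyyy-MM-01' string for the date 60 days before the latest event_date.
-- my_table and event_date are hypothetical names used only for illustration.
SELECT concat(
         cast(year(date_sub(max(event_date), 60)) AS string), '-',
         lpad(month(date_sub(max(event_date), 60)), 2, '0'),
         '-01'
       ) AS month_start
FROM my_table;
```

The resulting string is in a form that TO_DATE can parse in a follow-up query.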

Select top 2 rows in Hive

╄→尐↘猪︶ㄣ submitted on 2020-01-12 11:58:04
Question: I'm a newbie here. I'm trying to retrieve the top 2 rows from my employee list based on salary in Hive (version 0.11). Since it doesn't support the TOP function, are there any alternatives? Or do we have to define a UDF?
Answer 1: Yes, you can use LIMIT here. Try the query below: SELECT * FROM employee_list SORT BY salary DESC LIMIT 2
Answer 2: select * from employee_list order by salary desc limit 2;
Source: https://stackoverflow.com/questions/30441744/select-top-2-rows-in-hive
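The two answers differ in one respect worth spelling out: ORDER BY produces a total order across the whole result, while SORT BY only orders rows within each reducer, so the ORDER BY form is the safer way to get a global top 2. As a sketch, the same result can also be expressed with a window function, assuming a Hive version with windowing support (0.11 or later). Only employee_list and salary come from the question; the aliases ranked and rn are illustrative.

```sql
-- Global top 2 by salary: ORDER BY sorts the full result before LIMIT applies.
SELECT *
FROM employee_list
ORDER BY salary DESC
LIMIT 2;

-- Window-function variant; rn and ranked are illustrative aliases.
SELECT *
FROM (
  SELECT e.*, row_number() OVER (ORDER BY salary DESC) AS rn
  FROM employee_list e
) ranked
WHERE rn <= 2;
```

The window form extends to "top 2 per group" by adding a PARTITION BY clause inside OVER.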

How are Hive SQL queries submitted as MR jobs from the Hive CLI

时光总嘲笑我的痴心妄想 submitted on 2020-01-11 09:41:32
Question: I have deployed a CDH-5.9 cluster with MR as the Hive execution engine. I have a Hive table named "users" with 50 rows. The query select * from users works fine, as follows:
hive> select * from users;
OK
Adam 1 38 ATK093 CHEF
Benjamin 2 24 ATK032 SERVANT
Charles 3 45 ATK107 CASHIER
Ivy 4 30 ATK384 SERVANT
Linda 5 23 ATK132 ASSISTANT
. . .
Time taken: 0.059 seconds, Fetched: 50 row(s)
But issuing select max(age) from users failed after it was submitted as an MR job. The container log

How do you get 'event date > current date - 10 days' in HiveQL?

情到浓时终转凉″ submitted on 2020-01-07 09:03:15
Question: I'm putting together a query, refreshed daily, that needs to pull records from the last ten days. The tables I'm accessing have an 'xxdatetime' column with the Unix timestamp and an 'eventdate' column with the date in yyyy-mm-dd format. In Impala, the answer was easy: where eventdate > to_date(days_sub(now(), 10)) I used a variation of it in Hive that failed, I guess because it was scanning the whole table and the tables are MASSIVE: where datediff(cast(current_timestamp() as string),
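One possible shape for the Hive filter, offered as a sketch rather than a confirmed fix, is to compare eventdate directly against a cutoff derived from the current date instead of wrapping the column in datediff. The table name events is hypothetical, and current_date is assumed to be available (Hive 1.2.0 or later). Whether this avoids a full scan depends on eventdate being a partition column, which the question does not state.

```sql
-- events is a hypothetical table name; eventdate holds 'yyyy-MM-dd' values.
-- The column is left bare on the left-hand side so that, if it is a
-- partition column, the planner has a chance to prune partitions.
SELECT *
FROM events
WHERE eventdate > date_sub(current_date, 10);
```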

hive 1.2.1 error on delete command

一曲冷凌霜 submitted on 2020-01-06 20:13:23
Question: I use Apache Hive 1.2.1. HiveServer2's metastore is in embedded mode. In the hive-default.xml file I have the following properties:
<property> <name>hive.support.concurrency</name> <value>true</value> <description> </description> </property>
<property> <name>hive.enforce.bucketing</name> <value>true</value> <description></description> </property>
<property> <name>hive.exec.dynamic.partition.mode</name> <value>nonstrict</value> <description> </description> </property>
<property> <name>hive.txn

HiveSQL access JSON-array values

痞子三分冷 submitted on 2020-01-06 09:12:25
Question: I have a table in Hive that is generated by reading from a sequence file in my HDFS. Those sequence files contain JSON and look like this:
{"Activity":"Started","CustomerName":"CustomerName3","DeviceID":"StationRoboter","OrderID":"CustomerOrderID3","DateTime":"2018-11-27T12:56:47Z+0100","Color":[{"Name":"red","Amount":1},{"Name":"green","Amount":1},{"Name":"blue","Amount":1}],"BrickTotalAmount":3}
They submit product part colours and the amount of them which are counted in one service process
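For a record shaped like the sample above, individual array elements can be reached with Hive's built-in get_json_object and a JSON path that uses a [] subscript. This is a sketch under assumptions: the table name activity_log and the string column json are hypothetical, since the excerpt does not show the table definition.

```sql
-- activity_log and its string column json are hypothetical names.
-- '$.Color[0]' addresses the first element of the Color array.
SELECT get_json_object(json, '$.Color[0].Name')    AS first_color_name,
       get_json_object(json, '$.Color[0].Amount')  AS first_color_amount,
       get_json_object(json, '$.BrickTotalAmount') AS brick_total
FROM activity_log;
```

get_json_object returns strings, so numeric fields such as Amount may need a cast before arithmetic.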

TBLPROPERTIES('skip.header.line.count'='1') is not working on sparkThrift connected from beeline with hive jdbc 1.2.1

谁说胖子不能爱 submitted on 2020-01-06 06:46:21
Question: I am using Spark 2.3 and connecting to sparkThrift with beeline. Hive JDBC version: 1.2.1. Spark SQL version: 2.3.1. I am trying to create an external table with the skip-header property, but the select command always returns data with the header as the first row. Below is my create query:
CREATE EXTERNAL TABLE datasourcename11( `retail_invoice_detail_sys_invoice_no` STRING, `store_id` STRING, `retail_invoice_detail_invoice_time` STRING, `retail_invoice_detail_invoice_date` string, `cust_id` STRING, `article_code`