hiveql

SQL query Frequency Distribution matrix for product

你离开我真会死。 submitted on 2020-08-04 04:27:06
Question: I want to create a frequency distribution matrix.
1. Create a matrix. **Is it possible to get this in separate columns?**
customer 1: p1 p2 p3
customer 2: p2 p3
customer 3: p2 p3 p1
customer 4: p2 p1
2. Then I have to count which products occur together most often. For example: p2 and p3 occur together 3 times, p1 and p3 occur together 2 times, p1 and p2 occur together 2 times.
I want to recommend products to customers based on how frequently products occur together. select customerId,product,count(*) from sales group by customerId
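Both steps in the question can be sketched in HiveQL. This is a minimal sketch, not the asker's or an answerer's solution. It assumes a table sales(customerId, product) with one row per purchase and a string-typed product column, as the question's own query suggests. The aliases products, product_a, product_b, and customers are illustrative.

```sql
-- Step 1: one row per customer with the distinct products they bought.
-- This gives a delimited list, not one column per product.
SELECT customerId,
       concat_ws(' ', sort_array(collect_set(product))) AS products
FROM sales
GROUP BY customerId;

-- Step 2: count how many customers bought each pair of products.
-- The a.product < b.product filter keeps each unordered pair once
-- and drops a product paired with itself.
SELECT a.product AS product_a,
       b.product AS product_b,
       count(DISTINCT a.customerId) AS customers
FROM sales a
JOIN sales b
  ON a.customerId = b.customerId
WHERE a.product < b.product
GROUP BY a.product, b.product
ORDER BY customers DESC;
```

Getting one column per product would need the product list to be known in advance, for example one conditional aggregate per product.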

Convert Ascii value to Character in hive

北城以北 submitted on 2020-08-02 08:11:29
Question: I want to convert an ASCII value to its character in Hive. Is there an existing function in Hive (like the char function in SQL Server)? Does anyone know how to achieve this in Hive? For example, for 65 the result would be A. Thanks in advance. Answer 1: This is possible by combining a few of the built-in functions: select decode(unhex(hex(65)), 'US-ASCII'); hex converts the int value to a hexadecimal string, unhex converts that string to binary, and decode then interprets the binary as ASCII data.
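The same expression can be applied to a column rather than a literal. A minimal sketch, assuming an inline two-row subquery in place of a real table; the aliases code, ch, and t are illustrative.

```sql
-- Apply the hex -> unhex -> decode chain from the answer to each row.
SELECT t.code,
       decode(unhex(hex(t.code)), 'US-ASCII') AS ch
FROM (
  SELECT 65 AS code
  UNION ALL
  SELECT 97 AS code
) t;
-- 65 maps to 'A' and 97 maps to 'a'.
```

Newer Hive releases also list a chr function in the UDF manual; whether it is available depends on the Hive version in use, so check before relying on it.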

How to keep Column Names in camel case in hive

无人久伴 submitted on 2020-07-09 05:02:20
Question: select '12345' as `EmpId`; -- output is empid with value 12345. Any leads on how to keep the column name as EmpId? Answer 1: Not possible. This is a limitation of the Hive metastore, which stores the schema of a table in all lowercase. Hive uses this method to normalize column names; see Table.java: private static String normalize(String colName) throws HiveException { if (!MetaStoreServerUtils.validateColumnName(colName)) { throw new HiveException("Invalid column name '" + colName + "' in the table

Hive query: select a column based on the condition another columns values match some specific values, then create the match result as a new column

不问归期 submitted on 2020-06-27 18:37:06
Question: I need to run some queries and create new columns in HiveQL. For example:
app     col1
app1    anybody love me?
app2    I hate u
app3    this hat is good
app4    I don't like this one
app5    oh my god
app6    damn you.
app7    such nice girl
app8    xxxxx
app9    pretty prefect
app10   don't love me.
app11   xxx anybody?
I want to match against a keyword list like ['anybody', 'love', 'you', 'xxx', 'don't'] and select the matched keywords as a new column named keyword, as follows: app keyword app1 anybody, love app4 I don't like
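One possible approach, sketched under assumptions since the excerpt is truncated and shows no answer: expand the keyword list into rows with LATERAL VIEW explode, keep the rows where the text contains the keyword, and collect the matches per app. The table name apps and the aliases k and kw are hypothetical. The match is a plain substring test, so 'xxxxx' also matches 'xxx', which may or may not be what the asker wants.

```sql
-- Pair every row with every keyword, keep substring matches,
-- then gather the matched keywords for each app.
SELECT t.app,
       concat_ws(', ', collect_list(k.kw)) AS keyword
FROM apps t
LATERAL VIEW explode(array('anybody', 'love', 'you', 'xxx', "don't")) k AS kw
WHERE instr(t.col1, k.kw) > 0
GROUP BY t.app;
```

Apps with no matching keyword drop out of the result, and the order of keywords inside each list is not guaranteed.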

Concatenate multiple columns into one in hive

丶灬走出姿态 submitted on 2020-06-24 13:50:33
Question: I need to concatenate column values into a single column. I have the column names in a variable, colnames=col1,col2,col3. I am writing the query below in a Unix shell script and calling Hive. But when I do this, I get only the column names concatenated, not the values of those columns. select concat('regexp_replace("${colnames}",",","^")) as result from table; I would like the output to be ABCD^10^XYZ (ABCD, 10, and XYZ are the column values). Answer 1: You need concat_ws function to concatenate
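A minimal sketch of the concat_ws approach the answer points to, assuming the column list is passed as a Hive variable. The table name my_table and the script name concat_cols.hql are hypothetical. Hive substitutes the variable textually before parsing, so the list is read as column references rather than as a quoted string.

```sql
-- Run with: hive --hivevar colnames=col1,col2,col3 -f concat_cols.hql
-- After substitution this reads concat_ws('^', col1,col2,col3).
SELECT concat_ws('^', ${hivevar:colnames}) AS result
FROM my_table;
```

concat_ws accepts only string or array<string> arguments, so a numeric column such as the one holding 10 needs an explicit cast to string in the list.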

Hive explain plan understanding

丶灬走出姿态 submitted on 2020-06-21 10:31:13
Question: Is there a good resource for fully understanding the explain plan generated by Hive? I searched the wiki but could not find a complete guide. Here is the wiki page that briefly explains how the explain plan works, but I need more information on how to interpret it. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain Answer 1: I will try to explain a little of what I know. The execution plan is a description of the tasks
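For orientation, a plan is produced by prefixing a query with EXPLAIN. A minimal sketch, assuming a hypothetical table sales(customerId, product); the exact output depends on the Hive version and execution engine.

```sql
-- Print the plan instead of running the query.
-- The output lists the stage dependencies, then the operators in each stage.
EXPLAIN
SELECT product, count(*) AS purchases
FROM sales
GROUP BY product;

-- EXTENDED adds more detail, such as file paths, to the same plan.
EXPLAIN EXTENDED
SELECT product, count(*) AS purchases
FROM sales
GROUP BY product;
```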

Calculate number of days excluding sunday in Hive

元气小坏坏 submitted on 2020-06-12 05:40:26
Question: I have two timestamps as input. I want to calculate the time difference in hours between them, excluding Sundays. I can get the number of days using the datediff function in Hive, and I can get the day of the week for a particular date using from_unixtime(unix_timestamp(startdate), 'EEEE'). But I don't know how to combine those functions to meet my requirement, or whether there is an easier way to achieve this. Thanks in advance. Answer 1: You can write one custom UDF which takes two columns containing the
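As an alternative to the custom UDF the answer suggests, the day count can be sketched in plain HiveQL by generating one row per calendar day and filtering out Sundays. This is a swapped-in technique, not the answerer's method. It assumes a hypothetical table events(id, startdate, enddate) with enddate on or after startdate, a Hive version that provides date_format, and English day names.

```sql
-- space(n) yields n spaces; splitting on ' ' yields n + 1 elements,
-- so posexplode produces offsets 0..n, one per calendar day in the range.
SELECT t.id,
       count(*) AS non_sunday_days
FROM events t
LATERAL VIEW posexplode(split(space(datediff(t.enddate, t.startdate)), ' ')) d AS i, x
WHERE date_format(date_add(t.startdate, d.i), 'EEEE') <> 'Sunday'
GROUP BY t.id;
```

This counts whole calendar days, including both endpoints, rather than hours. Converting to hours would also need handling of the partial first and last days, which is not shown here. A range that contains only Sundays produces no row.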

Insert overwrite on partitioned table is not deleting the existing data

被刻印的时光 ゝ submitted on 2020-06-08 20:01:28
Question: I am trying to run INSERT OVERWRITE on a partitioned table. The SELECT query of the INSERT OVERWRITE omits one partition completely, and the existing data in that partition is not deleted. Is this the expected behavior? Table definition: CREATE TABLE `cities_red`( `cityid` int, `city` string) PARTITIONED BY ( `state` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 'auto.purge'='true
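With dynamic partitioning, INSERT OVERWRITE replaces only the partitions for which the SELECT produces rows; a partition the SELECT never emits is left as it was. A minimal sketch of that situation, assuming a hypothetical source table cities_src with the same columns and a hypothetical state value 'CA'.

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column goes last in the SELECT list.
-- No row with state = 'CA' is produced, so that partition is not touched.
INSERT OVERWRITE TABLE cities_red PARTITION (state)
SELECT cityid, city, state
FROM cities_src
WHERE state <> 'CA';

-- To clear the omitted partition as well, drop it explicitly.
ALTER TABLE cities_red DROP IF EXISTS PARTITION (state = 'CA');
```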