hiveql

Hadoop/Hive Collect_list without repeating items

时间秒杀一切 submitted on 2019-12-31 04:29:04
Question: Based on the post Hive 0.12 - Collect_list, I am trying to locate Java code to implement a UDAF that will accomplish this or similar functionality, but without a repeating sequence. For instance, collect_all() returns the sequence A, A, A, B, B, A, C, C; I would like the sequence A, B, A, C returned, with sequentially repeated items removed. Does anyone know of a function in Hive 0.12 that will accomplish this, or has anyone written their own UDAF? As always, thanks for the help. Answer 1: I ran into
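The behavior the asker wants — collapsing only consecutive duplicates, unlike collect_set, which removes all duplicates — can be sketched outside Hive. This is a hypothetical Python illustration of the logic such a UDAF would implement, not Hive code:

```python
from itertools import groupby

def collect_list_dedup_sequential(values):
    """Collapse runs of consecutive duplicates, preserving order.

    Unlike a set-based dedup, a value may reappear later in the
    result as long as the repeats are not adjacent.
    """
    # groupby groups consecutive equal elements; keep one key per run
    return [key for key, _run in groupby(values)]

print(collect_list_dedup_sequential(["A", "A", "A", "B", "B", "A", "C", "C"]))
# → ['A', 'B', 'A', 'C']
```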

How does hashing work in bucketing for Hive?

眉间皱痕 submitted on 2019-12-31 03:52:25
Question: I know the hashing principle for HashMap in Java, so I wanted to know how hashing works in Hive when we bucket the data into various buckets. Answer 1: I recently had to dig into some Hive source code to figure this out for myself. Here's what I found: for an integer field, the hash is just the integer value. For a string, it uses a similar version of Java's String hashCode. When hashing multiple values, the hash is a similar version of Java's List hashCode. Answer 2: Bucketing is used
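As a rough illustration of the scheme the answer describes, here is a hedged Python sketch: the bucket index is computed as (hash & Integer.MAX_VALUE) % numBuckets, where an int hashes to itself and a string hashes like Java's String.hashCode. This mimics the described behavior rather than calling Hive's actual code, so treat it as an approximation:

```python
def java_string_hashcode(s):
    """Java's String.hashCode: h = 31*h + c, on 32-bit signed ints."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep 32 bits
    # reinterpret as signed 32-bit, as Java would
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket_for(value, num_buckets):
    """Pick a bucket the way Hive does: (hash & MAX_INT) % num_buckets."""
    h = value if isinstance(value, int) else java_string_hashcode(value)
    return (h & 0x7FFFFFFF) % num_buckets

print(bucket_for(42, 8))     # integer hashes to itself → 2
print(bucket_for("abc", 8))  # string hashes like Java's "abc".hashCode()
```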

Hive - Can one extract common options for reuse in other scripts?

非 Y 不嫁゛ submitted on 2019-12-30 07:06:11
Question: I have two Hive scripts which look like this: Script A: SET hive.exec.dynamic.partition=true; SET hive.exec.dynamic.partition.mode=nonstrict; SET hive.exec.parallel=true; ... do something ... Script B: SET hive.exec.dynamic.partition=true; SET hive.exec.dynamic.partition.mode=nonstrict; SET hive.exec.parallel=true; ... do something else ... The options that we set at the beginning of each script are identical. Is it possible to extract them out to a common place (for example, into a

HiveQL - How to find whether a column value is numeric or not using any UDF?

拥有回忆 submitted on 2019-12-30 03:18:08
Question: Basically I would like to return rows based on one column's value: if the column contains non-numeric values, then return those rows from a Hive table. Is any UDF available in Hive? Answer 1: I believe Hive supports rlike (regular expressions). So you can do: where col rlike '[^0-9]' This looks for any non-digit character. You can expand this if your numeric values might have decimal points or commas. Answer 2: Use cast(expr as <type>) . A null is returned if the conversion does not succeed. case
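The regex logic in Answer 1 can be checked outside Hive. This Python sketch mimics how col rlike '[^0-9]' flags values containing any non-digit character; it is an approximation, since Hive's rlike follows Java regex semantics:

```python
import re

def has_non_digit(value):
    """Mimic `col rlike '[^0-9]'`: True if any non-digit char is present."""
    return re.search(r"[^0-9]", value) is not None

rows = ["12345", "12a45", "3.14", ""]
non_numeric = [v for v in rows if has_non_digit(v)]
print(non_numeric)  # "12a45" and "3.14" contain non-digit characters
```

Note that a decimal point counts as a non-digit under this pattern, which is why the answer suggests widening the character class if decimals or commas are valid.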

How can I convert an array to a string in Hive SQL?

只愿长相守 submitted on 2019-12-30 02:39:07
Question: I want to convert an array to a string in Hive. I want to convert collect_set array values to a string without [[""]] . select actor, collect_set(date) as grpdate from actor_table group by actor; so that [["2016-07-01", "2016-07-02"]] becomes 2016-07-01, 2016-07-02. Answer 1: Use the concat_ws(string delimiter, array<string>) function to concatenate the array: select actor, concat_ws(',',collect_set(date)) as grpdate from actor_table group by actor; If the date field is not string, then convert it to
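The effect of concat_ws over collect_set can be sketched in Python: collect_set deduplicates the values, and concat_ws joins them with the delimiter, skipping NULLs. A hypothetical illustration of that pipeline:

```python
def concat_ws(delimiter, values):
    """Rough analog of Hive's concat_ws: join the non-NULL values."""
    return delimiter.join(v for v in values if v is not None)

dates = ["2016-07-01", "2016-07-02", "2016-07-01", None]
# collect_set-style dedup, preserving first-seen order
distinct_dates = list(dict.fromkeys(d for d in dates if d is not None))
grpdate = concat_ws(",", distinct_dates)
print(grpdate)  # → 2016-07-01,2016-07-02
```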

Map type variable in Hive

谁都会走 submitted on 2019-12-30 01:24:29
Question: I am having trouble trying to define a map type in Hive. According to the Hive Manual there definitely is a map type; unfortunately, there aren't any examples of how to use it. :-( Suppose I have a table (users) with the following columns: Name, Ph, CategoryName. This CategoryName column has a specific set of values. Now I want to create a hashtable that maps CategoryName to CategoryID. I tried doing: set hivevar:nameToID=map('A',1,'B',2); I have 2 questions: When I do set hivevar:${nameToID['A']} I
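Hive's map('A',1,'B',2) literal is a key-value constructor taking alternating keys and values. As a hypothetical Python analog of the lookup the asker is trying to get out of the hivevar:

```python
def hive_map(*args):
    """Analog of Hive's map(k1, v1, k2, v2, ...) constructor."""
    if len(args) % 2 != 0:
        raise ValueError("map() needs an even number of arguments")
    # pair up alternating keys and values
    return dict(zip(args[0::2], args[1::2]))

name_to_id = hive_map("A", 1, "B", 2)
print(name_to_id["A"])  # → 1
```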

How to convert the date 2017-sep-12 to 2017-09-12 in Hive

限于喜欢 submitted on 2019-12-29 09:11:29
Question: I am facing an issue converting a date in Hive. I need to convert 2017-sep-12 to 2017-09-12. How can I achieve this in Hive? Answer 1: Use unix_timestamp(string date, string pattern) to convert the given date format to seconds elapsed since 1970-01-01, then use from_unixtime() to convert to the desired format: hive> select from_unixtime(unix_timestamp('2017-sep-12' ,'yyyy-MMM-dd'), 'yyyy-MM-dd'); OK 2017-09-12 Source: https://stackoverflow.com/questions/47301455/how-to-convert-date-2017-sep-12-to-2017-09-12
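The same conversion can be checked with Python's strptime/strftime, using format codes analogous to the Hive patterns (%b matches an abbreviated month name such as "sep", much like MMM in Hive's SimpleDateFormat):

```python
from datetime import datetime

def convert(date_str):
    """Parse 'yyyy-MMM-dd' style input (e.g. 2017-sep-12) and
    reformat it as yyyy-MM-dd."""
    return datetime.strptime(date_str, "%Y-%b-%d").strftime("%Y-%m-%d")

print(convert("2017-sep-12"))  # → 2017-09-12
```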

Array Intersection in Spark SQL

旧巷老猫 submitted on 2019-12-29 08:03:10
Question: I have a table with an array-type column named writer, which has values like array[value1, value2] , array[value2, value3] , etc. I am doing a self join to get results which have common values between the arrays. I tried: sqlContext.sql("SELECT R2.writer FROM table R1 JOIN table R2 ON R1.id != R2.id WHERE ARRAY_INTERSECTION(R1.writer, R2.writer)[0] is not null ") And sqlContext.sql("SELECT R2.writer FROM table R1 JOIN table R2 ON R1.id != R2.id WHERE ARRAY_INTERSECT(R1.writer, R2.writer)[0] is
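The join condition the asker is after — keep pairs of rows whose writer arrays share at least one element — is a set intersection. A hypothetical in-memory Python version of that self join, using made-up sample rows:

```python
rows = [
    (1, ["value1", "value2"]),
    (2, ["value2", "value3"]),
    (3, ["value4"]),
]

# self join on id inequality, keeping pairs with a non-empty intersection
pairs = [
    (r1_id, r2_id)
    for r1_id, w1 in rows
    for r2_id, w2 in rows
    if r1_id != r2_id and set(w1) & set(w2)
]
print(pairs)  # rows 1 and 2 share "value2"
```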

How to update table in Hive 0.13?

﹥>﹥吖頭↗ submitted on 2019-12-28 07:09:42
Question: My Hive version is 0.13. I have two tables, table_1 and table_2.

table_1 contains:

customer_id | items | price | updated_date
------------+-------+-------+-------------
10          | watch | 1000  | 20170626
11          | bat   | 400   | 20170625

table_2 contains:

customer_id | items    | price | updated_date
------------+----------+-------+-------------
10          | computer | 20000 | 20170624

I want to update records in table_2 if the customer_id already exists in it; if not, the row should be appended to table_2. As Hive 0.13 does not
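Since Hive 0.13 lacks UPDATE, the usual workaround is to INSERT OVERWRITE the target from a join or union that prefers the newer row per customer_id. The merge semantics can be sketched in Python, modeling each table as a dict keyed by customer_id (data taken from the question):

```python
table_2 = {
    10: ("computer", 20000, "20170624"),
}
table_1 = {
    10: ("watch", 1000, "20170626"),
    11: ("bat", 400, "20170625"),
}

# upsert: rows from table_1 replace matching customer_ids in table_2,
# and brand-new customer_ids are appended
merged = {**table_2, **table_1}
print(sorted(merged.items()))
```

In SQL terms this is the "overwrite with the winning row per key" pattern; the dict merge just makes the precedence rule (incoming data wins) explicit.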
