Hive

Hive Buckets-understanding TABLESAMPLE(BUCKET X OUT OF Y)

元气小坏坏 posted on 2020-06-25 10:28:28
Question: Hi, I am very new to Hive. I have gone through the buckets concept in Hadoop in Action, but failed to understand the lines below. Can anyone help me with this? SELECT avg(viewTime) FROM page_view TABLESAMPLE(BUCKET 1 OUT OF 32); The general syntax for TABLESAMPLE is TABLESAMPLE(BUCKET x OUT OF y). The sample size for the query is around 1/y. In addition, y needs to be a multiple or factor of the number of buckets specified for the table at table creation time. For example, if we change y to 16,
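To make the "multiple or factor" rule concrete, here is a small Python sketch (not Hive code) of which table buckets a TABLESAMPLE(BUCKET x OUT OF y) clause would read. It assumes a simplified model in which a row with non-negative hash h is stored in table bucket h % num_buckets and is selected by the sample when h % y == x - 1; the function name sampled_buckets is hypothetical.

```python
from fractions import Fraction
from math import gcd


def sampled_buckets(num_buckets, x, y):
    """Map each table bucket (1-based) to the fraction of it that
    BUCKET x OUT OF y would read, under the simplified hash model."""
    if not 1 <= x <= y:
        raise ValueError("x must satisfy 1 <= x <= y")
    # Hash residues repeat with this period, so one period covers every case.
    period = num_buckets * y // gcd(num_buckets, y)
    hits = {}
    for h in range(period):
        if h % y == x - 1:  # row selected by the sample
            bucket = h % num_buckets + 1  # table bucket holding the row
            hits[bucket] = hits.get(bucket, 0) + 1
    residues_per_bucket = period // num_buckets
    return {b: Fraction(n, residues_per_bucket) for b, n in sorted(hits.items())}


# Table created with 32 buckets:
print(sampled_buckets(32, 1, 32))  # exactly one bucket
print(sampled_buckets(32, 3, 16))  # two whole buckets, so about 1/16 of the data
print(sampled_buckets(32, 3, 64))  # half of one bucket, so about 1/64 of the data
```

With y equal to the bucket count, one whole bucket is read. With y = 16 (a factor of 32), two whole buckets are read. With y = 64 (a multiple of 32), only half of a single bucket is needed.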

Hive : Exception.. class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader

僤鯓⒐⒋嵵緔 posted on 2020-06-24 14:13:31
Question: When I run the command hive, I get the following error. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache

Concatenate multiple columns into one in hive

丶灬走出姿态 posted on 2020-06-24 13:50:33
Question: I need to concatenate column values into a single column. I have the column names in a variable as colnames=col1,col2,col3. I am writing the query below from a Unix shell and calling Hive. But when I do this, I get only the column names concatenated, not the values of those columns. select concat('regexp_replace("${colnames}",",","^")) as result from table; I would like the output as: ABCD^10^XYZ (ABCD, 10, XYZ are the column values) Answer 1: You need the concat_ws function to concatenate
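One way to read the answer's hint, sketched in Python rather than shell: build the query text so that the column names appear as identifiers passed to concat_ws, instead of as one quoted string. The helper name build_concat_query and the table name my_table are hypothetical, and the cast to string is an assumption added because concat_ws expects string arguments. The sketch does no identifier validation, so it should only be fed trusted column names.

```python
def build_concat_query(colnames, table, sep="^"):
    """Build a Hive query that joins the *values* of the given columns.

    colnames is a comma-separated string such as "col1,col2,col3".
    """
    cols = [c.strip() for c in colnames.split(",") if c.strip()]
    if not cols:
        raise ValueError("no column names given")
    # Each column is an unquoted identifier, so Hive reads its value.
    args = ", ".join(f"cast({c} as string)" for c in cols)
    return f"select concat_ws('{sep}', {args}) as result from {table}"


print(build_concat_query("col1,col2,col3", "my_table"))
```

The generated text can then be passed to Hive in whatever way the calling script already uses.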

Hive explain plan understanding

丶灬走出姿态 posted on 2020-06-21 10:31:13
Question: Is there any proper resource from which we can fully understand the explain plan generated by Hive? I have tried searching the wiki but could not find a complete guide to understanding it. Here is the wiki page that briefly explains how the explain plan works, but I need further information on how to interpret the explain plan. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain Answer 1: I will try to explain a little of what I know. The execution plan is a description of the tasks

Hive explain plan understanding

时间秒杀一切 posted on 2020-06-21 10:30:09
Question: Is there any proper resource from which we can fully understand the explain plan generated by Hive? I have tried searching the wiki but could not find a complete guide to understanding it. Here is the wiki page that briefly explains how the explain plan works, but I need further information on how to interpret the explain plan. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain Answer 1: I will try to explain a little of what I know. The execution plan is a description of the tasks

Hive explain plan understanding

人走茶凉 posted on 2020-06-21 10:30:00
Question: Is there any proper resource from which we can fully understand the explain plan generated by Hive? I have tried searching the wiki but could not find a complete guide to understanding it. Here is the wiki page that briefly explains how the explain plan works, but I need further information on how to interpret the explain plan. https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain Answer 1: I will try to explain a little of what I know. The execution plan is a description of the tasks

Does hive have a data dictionary?

淺唱寂寞╮ posted on 2020-06-17 02:03:10
Question: Does Hive have a data dictionary? I am trying to fetch the column names of tables in Hive, other than with the describe command, similar to this Oracle query: SELECT COLUMN_NAME,DATA_TYPE FROM USER_TAB_COLUMNS WHERE TABLE_NAME = ? ORDER BY COLUMN_ID; Answer 1: Hive uses an external relational database as its metastore. You can query the configured metastore directly, using the metastore API (e.g. MySQL). A higher-level component is HCatalog, which offers an API to access and manipulate the metastore. Answer 2: Hive

Does hive have a data dictionary?

非 Y 不嫁゛ posted on 2020-06-17 02:01:34
Question: Does Hive have a data dictionary? I am trying to fetch the column names of tables in Hive, other than with the describe command, similar to this Oracle query: SELECT COLUMN_NAME,DATA_TYPE FROM USER_TAB_COLUMNS WHERE TABLE_NAME = ? ORDER BY COLUMN_ID; Answer 1: Hive uses an external relational database as its metastore. You can query the configured metastore directly, using the metastore API (e.g. MySQL). A higher-level component is HCatalog, which offers an API to access and manipulate the metastore. Answer 2: Hive

Calculate number of days excluding sunday in Hive

元气小坏坏 posted on 2020-06-12 05:40:26
Question: I have two timestamps as input. I want to calculate the time difference in hours between those timestamps, excluding Sundays. I can get the number of days using the datediff function in Hive. I can get the day of the week for a particular date using from_unixtime(unix_timestamp(startdate), 'EEEE'). But I don't know how to combine those functions to achieve my requirement, or whether there is an easier way to achieve this. Thanks in advance. Answer 1: You can write one custom UDF which takes two columns containing the
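The logic such a function would need can be sketched in plain Python (this is an illustration of the calculation, not a Hive UDF). It walks from the start timestamp to the end timestamp one calendar day at a time and skips any portion that falls on a Sunday. The function name hours_excluding_sundays is hypothetical, and time zones are ignored.

```python
from datetime import datetime, timedelta


def hours_excluding_sundays(start, end):
    """Hours elapsed between start and end, not counting time on Sundays."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    total = timedelta()
    cursor = start
    while cursor < end:
        # End of the current calendar day, or the overall end if sooner.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        segment_end = min(next_midnight, end)
        if cursor.weekday() != 6:  # Monday is 0, Sunday is 6
            total += segment_end - cursor
        cursor = segment_end
    return total.total_seconds() / 3600


# Friday 2020-06-12 00:00 to Monday 2020-06-15 00:00 spans one Sunday.
print(hours_excluding_sundays(datetime(2020, 6, 12), datetime(2020, 6, 15)))
```

The same day-by-day walk could be ported into a UDF, which is the direction the answer above points to.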

how to run hive in debug mode

末鹿安然 posted on 2020-06-09 12:14:16
Question: I took an example from the Cloudera website to write a custom SerDe for parsing a file: http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/ It seems a good example, but when I create a table with the custom SerDe: ADD JAR <path-to-hive-serdes-jar>; CREATE EXTERNAL TABLE tweets ( id BIGINT, created_at STRING, source STRING, favorited BOOLEAN, retweeted_status STRUCT< text:STRING, user:STRUCT<screen_name:STRING,name:STRING>, retweet_count:INT>, entities STRUCT< urls:ARRAY<STRUCT<expanded