hive-table

Hive - create Hive table from specific data of three CSV files in HDFS

Submitted by a 夏天 on 2020-04-18 05:48:27
Question: I have three .csv files, each in a different HDFS directory. I now want to build a Hive internal table from the data in those three files: four columns from the first file, three columns from the second file, and two columns from the third file. The first file shares a unique id column with the second file, and the second file shares another unique id column with the third file; both unique ids are present in the second file. Using these ids I would like to left-outer-join the files to make the table. file 1: '/directory_1/sub_directory_1
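One way to approach this, sketched below in HiveQL: map each directory as an external table, then materialize the internal table with a CTAS that joins through the second file, which carries both ids. All table names, column names, paths, and the comma delimiter are assumptions for illustration, not from the question.

```sql
-- Hypothetical external tables over the three CSV directories
-- (paths, column names, and delimiter are assumptions).
CREATE EXTERNAL TABLE f1 (id1 STRING, a STRING, b STRING, c STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_1/sub_directory_1';

CREATE EXTERNAL TABLE f2 (id1 STRING, id2 STRING, x STRING, y STRING, z STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_2/sub_directory_2';

CREATE EXTERNAL TABLE f3 (id2 STRING, p STRING, q STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/directory_3/sub_directory_3';

-- Internal (managed) table built by left-outer-joining through file 2,
-- which contains both unique ids.
CREATE TABLE combined AS
SELECT f1.id1, f1.a, f1.b, f1.c,   -- four columns from file 1
       f2.x, f2.y, f2.z,           -- three columns from file 2
       f3.p, f3.q                  -- two columns from file 3
FROM f1
LEFT OUTER JOIN f2 ON f1.id1 = f2.id1
LEFT OUTER JOIN f3 ON f2.id2 = f3.id2;
```

Because `combined` is created without `EXTERNAL`, dropping it later also deletes its data, which is the defining property of a Hive internal table.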

Cannot create table with Spark SQL: Hive support is required to CREATE Hive TABLE (AS SELECT)

Submitted by 霸气de小男生 on 2019-12-11 16:40:31
Question: I'm trying to create a table in Spark (Scala) and then insert values from two existing dataframes, but I got this exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Hive support is required to CREATE Hive TABLE (AS SELECT);;
'CreateTable `stat_type_predicate_percentage`, ErrorIfExists

Here is the code:

case class stat_type_predicate_percentage (type1: Option[String], predicate: Option[String], outin: Option[INT], percentage: Option[FLOAT])
object LoadFiles1 { def
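The error message itself points at the usual cause: the `SparkSession` was built without Hive support. A minimal sketch of the fix, assuming the `spark-hive` module is on the classpath (app name and structure are illustrative):

```scala
import org.apache.spark.sql.SparkSession

object LoadFiles1 {
  def main(args: Array[String]): Unit = {
    // Building the session with enableHiveSupport() is what allows
    // CREATE Hive TABLE (AS SELECT) statements to be planned.
    val spark = SparkSession.builder()
      .appName("LoadFiles1")
      .enableHiveSupport()
      .getOrCreate()

    // Note the case class in the question would not compile either:
    // Scala's types are Int and Float, not INT and FLOAT, e.g.
    // case class stat_type_predicate_percentage(
    //   type1: Option[String], predicate: Option[String],
    //   outin: Option[Int], percentage: Option[Float])
  }
}
```

With Hive support enabled, `spark.sql("CREATE TABLE ... AS SELECT ...")` should no longer raise this `AnalysisException`.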

Spark DataFrame ORC Hive table reading issue

Submitted by て烟熏妆下的殇ゞ on 2019-12-09 03:40:30
I am trying to read a Hive table in Spark. Below is the Hive table's storage information:

# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
  field.delim \u0001
  serialization.format \u0001

When I try to read it using Spark SQL with the command below:

val c = hiveContext.sql("""select a from c_db.c cs where dt >= '2016-05-12' """)
c.show

I am getting the below
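The actual error is cut off above, so any fix is a guess; still, for ORC-backed Hive tables two read paths are worth trying. A sketch, assuming a Hive-enabled `SparkSession` named `spark` (the warehouse path is a hypothetical example, not from the question):

```scala
// 1) Through the metastore, using Hive's ORC SerDe:
val viaSql = spark.sql(
  "select a from c_db.c cs where dt >= '2016-05-12'")
viaSql.show()

// 2) Bypassing the metastore's SerDe entirely by reading the ORC
// files from the table's location (path is an assumption):
val viaOrc = spark.read.orc("/apps/hive/warehouse/c_db.db/c")
viaOrc.select("a").show()
```

If the direct ORC read works but the SQL read does not, the problem lies in how Spark converts the metastore's ORC table definition rather than in the files themselves, which narrows down where to look.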