csv file to hive table using load data - How to format the date in csv to accept by hive table

孤者浪人 提交于 2019-12-04 17:17:15

LasySimpleSerDe (default) does not work with quoted CSV. Use CSVSerDe:

create table dattable(
DATANUM    INT,  
ENTRYNUM BIGINT, 
START_DATE  DATE,
END_DATE    DATE ) 
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = ",",
   "quoteChar"     = "'"
)  
STORED AS TEXTFILE;

Also read this: CSVSerDe treats all columns to be of type String

Define you date columns as string and apply conversion in select.

Answer to your 2nd question:

You will need an additional temporary table to read your input file, and then you can do date conversions in your insert select statements.In your temporary table store date fields as string. Ex.

create table dattable_ext(
DATANUM    INT,  
ENTRYNUM BIGINT, 
START_DATE  String,
END_DATE    String) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

Load data into temporary table

LOAD DATA LOCAL INPATH '/path/dtatable.csv' OVERWRITE INTO TABLE dattable_ext;

Insert from temporary table to the managed table.

insert into table dattable select DATANUM, ENTRYNUM,
from_unixtime(unix_timestamp(START_DATE,'yyyy/MM/dd'),'yyyy-MM-dd'),
from_unixtime(unix_timestamp(END_DATE,'yyyy/MM/dd'),'yyyy-MM-dd') from dattable_ext;

You can replace date format in unix_timestamp function with your input date format.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!