问题
I have a PIG script that
- Loads and transforms the data from a csv
- Replaces some characters
Calls a java program (JAR) to convert the date-time in csv from 06/02/2015 18:52 to 2015-6-2 18:52 (mm/DD/yyyy to yyyy-MM-dd)
REGISTER /home/cloudera/DateTime.jar;
A = Load '/user/cloudera/Data.csv' using PigStorage(',') as (ac,datetime,amt,trace);
B = FOREACH A GENERATE ac, REPLACE(datetime, '\\/','-') as newdate,REPLACE(amt,'-','') as newamt,trace;
C = FOREACH B GENERATE ac,Converter.DateTime(newdate) as ConvDate,ConvAmt,trace;
Store C into '/user/cloudera/Output/' using PigStorage('\t');
Sample Input -- 21467245 06/02/2015 18:52 -9.59 518
Sample Output -- 21467245 2015-6-2 18:52 9.59 518
I am loading the output into hive, other fields seem fine during import, but the date-time field results null if loaded as timestamp and is intact when its string.
Where is this going wrong?
Am using Cloudera CDH 5
回答1:
From the hive docs:
Timestamps in text files have to use the format yyyy-mm-dd hh:mm:ss[.f...]. If they are in another format declare them as the appropriate type (INT, FLOAT, STRING, etc.) and use a UDF to convert them to timestamps.
So you need to either change your Converter
to output this format, or use a UDF --- or just keep them as strings, which is what I usually do !
来源:https://stackoverflow.com/questions/31051506/hive-not-detecting-timestamp-format