Question
I'm loading a TSV file with a datetime column and a long column using:
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long);
DUMP A;
An example line of input:
Tue Feb 11 05:02:10 +0000 2014 205291417
The corresponding line of output:
, 205291417
How do I do this properly?
Answer 1:
You'd want to load date as a chararray (date:chararray) and then convert it to a datetime using FOREACH ... GENERATE
along with the ToDate Pig built-in function.
The format string is based on Java's SimpleDateFormat pattern syntax:
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);
B = FOREACH A GENERATE ToDate(date, '<some format string>') AS date, userid;
DUMP B;
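As a rough sketch for the sample line in the question, a pattern along the lines of 'EEE MMM dd HH:mm:ss Z yyyy' should fit input such as "Tue Feb 11 05:02:10 +0000 2014". This pattern is an assumption based on that one example line and is not part of the original answer, so check it against your own data before relying on it.

```pig
-- Load the date as plain text first; parsing happens in a separate step.
A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);

-- Assumed pattern for lines like "Tue Feb 11 05:02:10 +0000 2014":
--   EEE = abbreviated day name, MMM = abbreviated month name, dd = day of month,
--   HH:mm:ss = 24-hour time, Z = UTC offset such as +0000, yyyy = four-digit year.
B = FOREACH A GENERATE ToDate(date, 'EEE MMM dd HH:mm:ss Z yyyy') AS date, userid;

DUMP B;
```

The pattern has to match the input text exactly, including spacing and the offset style, so it is worth testing it on a few raw lines first.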
Source: https://stackoverflow.com/questions/22052578/properly-loading-datetime-in-pig