Properly loading datetime in Pig

时光怂恿深爱的人放手 posted on 2019-12-10 14:53:16

Question


I'm loading a TSV file with a datetime column and a long column using:

A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:datetime, userid:long);
DUMP A;

An example line of input:

Tue Feb 11 05:02:10 +0000 2014  205291417

The corresponding line of output:

, 205291417

How do I do this properly?


Answer 1:


The date field comes out empty (null) because the text is not in a form Pig can cast directly to datetime. Load date as a chararray (date:chararray) instead, then convert it to a datetime using FOREACH ... GENERATE with Pig's built-in ToDate function.

The format string is based on Java's SimpleDateFormat patterns.

A = LOAD 'tweets-clean.txt' USING PigStorage('\t') AS (date:chararray, userid:long);
B = FOREACH A GENERATE ToDate(date, '<some format string>') AS date, userid;
DUMP B;
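
Because the format string follows SimpleDateFormat conventions, a candidate pattern can be checked in plain Java before it goes into the ToDate call above. The sketch below is only an illustration: the pattern "EEE MMM dd HH:mm:ss Z yyyy" is an assumption derived from the sample input line, not something given in the original answer, and the class and method names are hypothetical. Pig's own date parsing may differ in details, so the pattern should still be confirmed by running it in Pig.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper for trying out a date pattern against the sample line.
class TweetDatePattern {
    // Assumed pattern: day name, month name, day, time, numeric UTC offset, year.
    static final String PATTERN = "EEE MMM dd HH:mm:ss Z yyyy";

    // Parses text with PATTERN; English locale so "Tue" and "Feb" are recognized.
    static Date parseTweetDate(String text) {
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ENGLISH);
        try {
            return fmt.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Pattern does not match: " + text, e);
        }
    }

    // Renders a Date as an ISO 8601 string in UTC, for easy visual comparison.
    static String toIsoUtc(Date date) {
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ENGLISH);
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));
        return iso.format(date);
    }

    public static void main(String[] args) {
        Date parsed = parseTweetDate("Tue Feb 11 05:02:10 +0000 2014");
        System.out.println(toIsoUtc(parsed)); // prints 2014-02-11T05:02:10Z
    }
}
```

If the pattern matches here, the same string is a reasonable first thing to try in place of '<some format string>'.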


Source: https://stackoverflow.com/questions/22052578/properly-loading-datetime-in-pig
