Finding the difference between start_times and end_times in PIG

怎甘沉沦 submitted on 2019-12-23 05:25:31

Question


Could anyone please tell me how to find the difference between two times in Pig?

For example, below are sample Start_Times and End_Times (one pair per line, comma-separated). I need to find the difference between Start_Time and End_Time for each line in Pig.

12:31:38,14:54:04
10:18:34,13:30:56
13:37:43,15:18:57
08:15:10,11:28:17

Thanks in advance.


Answer 1:


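I couldn't find a straightforward way. Here is a workaround that parses both values as datetimes, takes the difference in seconds, and then splits it into hours, minutes and seconds:

    -- Load each line as two comma-separated time strings
    t = LOAD 'input/data' USING PigStorage(',') as (time1:chararray,time2:chararray);
    -- SecondsBetween takes the later time first, so the result is time2 - time1
    u = FOREACH t GENERATE SecondsBetween(ToDate(time2,'HH:mm:ss'),ToDate(time1,'HH:mm:ss')) as seconds;
    -- Break the total seconds into hours, minutes and leftover seconds
    v = FOREACH u GENERATE seconds/3600 as hours,(seconds%3600)/60 as minutes,(seconds%3600)%60 as seconds;
    -- Write the three fields joined by ':' (values are not zero-padded)
    STORE v into 'output/data' USING PigStorage(':');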

Output for your sample data with this code:

    2:22:26
    3:12:22
    1:41:14
    3:13:7
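
The arithmetic in the workaround above can be cross-checked outside Pig. The following is a minimal Python sketch, not part of the original answer; the helper names `to_seconds` and `time_diff` are made up for illustration, and it assumes the end time is never earlier than the start time (no crossing of midnight). Unlike the Pig output, it zero-pads minutes and seconds, so the last row prints as 3:13:07 rather than 3:13:7.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:mm:ss' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def time_diff(start: str, end: str) -> str:
    """Return end - start as 'H:MM:SS' (assumes end >= start, same day)."""
    seconds = to_seconds(end) - to_seconds(start)
    hours, remainder = divmod(seconds, 3600)   # same as seconds/3600 and seconds%3600
    minutes, secs = divmod(remainder, 60)      # same as (seconds%3600)/60 and (seconds%3600)%60
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Sample rows from the question
rows = [
    ("12:31:38", "14:54:04"),
    ("10:18:34", "13:30:56"),
    ("13:37:43", "15:18:57"),
    ("08:15:10", "11:28:17"),
]

for start, end in rows:
    print(time_diff(start, end))
```

The results match the Pig output above apart from the padding in the last row.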



Answer 2:


Use a UDF to convert each value to a UNIX timestamp. There is one for this in piggybank:

    DEFINE ISOToUnix org.apache.pig.piggybank.evaluation.datetime.convert.ISOToUnix();

Then, something like:

    a = FOREACH Dates GENERATE ISOToUnix(date2) - ISOToUnix(date1) AS diff;

It might require a bit of formatting/typing but it should work.
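
To illustrate the kind of formatting this approach may need: `ISOToUnix` works on ISO 8601 datetimes, so time-only values such as `12:31:38` would first need a date part. The Python sketch below mimics that idea and is not Pig code. The placeholder date `1970-01-01`, the UTC assumption, and the helper names `iso_to_unix_ms` and `diff_seconds` are all assumptions made for illustration. It also assumes the timestamps are in milliseconds, which is my understanding of what the piggybank UDF returns, so the difference is divided by 1000.

```python
from datetime import datetime, timezone

# Assumption: both times fall on the same day, so any fixed date works here.
ASSUMED_DATE = "1970-01-01"


def iso_to_unix_ms(iso: str) -> int:
    """Convert an ISO 8601 datetime string (treated as UTC) to Unix milliseconds."""
    dt = datetime.fromisoformat(iso).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def diff_seconds(start: str, end: str) -> int:
    """Difference end - start in whole seconds for two 'HH:mm:ss' strings."""
    start_ms = iso_to_unix_ms(f"{ASSUMED_DATE}T{start}")
    end_ms = iso_to_unix_ms(f"{ASSUMED_DATE}T{end}")
    return (end_ms - start_ms) // 1000


print(diff_seconds("12:31:38", "14:54:04"))  # 2 h 22 min 26 s expressed in seconds
```

The resulting number of seconds can then be split into hours, minutes and seconds the same way as in Answer 1.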



Source: https://stackoverflow.com/questions/24448004/finding-the-difference-between-start-times-and-end-times-in-pig
