Accessing an element like an array in Pig

Posted by 邮差的信 on 2019-12-13 02:38:29

Question


I have data in the form: id,val1,val2

Example:

1,0.2,0.1
1,0.1,0.7
1,0.2,0.3
2,0.7,0.9
2,0.2,0.3
2,0.4,0.5

First I want to sort the rows of each id by val1 in decreasing order, so something like:

1,0.2,0.1
1,0.2,0.3
1,0.1,0.7
2,0.7,0.9
2,0.4,0.5
2,0.2,0.3

Then I want to select the second (id, val2) combination for each id. For example:

  1,0.3
  2,0.5

How do I approach this?

Thanks


Answer 1:


Pig is a scripting language rather than a relational one like SQL, and it is well suited to working with groups using operators nested inside a FOREACH. Here is a solution:

A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float);
B = GROUP A BY id; -- isolate all rows for the same id
C = FOREACH B { -- here comes the scripting bit
    elems = ORDER A BY v1 DESC; -- sort rows belonging to the id
    two = LIMIT elems 2; -- select top 2
    two_invers = ORDER two BY v1 ASC; -- sort in opposite order to bubble second value to the top
    second = LIMIT two_invers 1;
    GENERATE group AS id, FLATTEN(second.v2) AS v2; -- group is already a scalar id; flatten the one-row bag
};
DUMP C;

In your example, id 1 has two rows with v1 == 0.2 but different v2 values, so the second value for id 1 can be either 0.1 or 0.3.
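
If a deterministic result is needed when v1 values tie, one option is to add a secondary sort key. The following is only a sketch of that idea, not part of the original answer: it assumes v2 ascending is an acceptable tie-breaker (which happens to match the ordering shown in the question), and the alias names are made up for illustration.

```pig
-- Assumes the same comma-separated 'input' file as in the script above.
A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float);
B = GROUP A BY id;
C = FOREACH B {
    sorted_rows = ORDER A BY v1 DESC, v2 ASC;         -- v2 ASC is an assumed tie-breaker
    top_two     = LIMIT sorted_rows 2;                -- keep the first two rows of that ordering
    flipped     = ORDER top_two BY v1 ASC, v2 DESC;   -- exact reverse ordering, so row two comes first
    second_row  = LIMIT flipped 1;
    GENERATE group AS id, FLATTEN(second_row.v2) AS v2;
};
DUMP C;
```

With the sample data this should pick v2 = 0.3 for id 1 and v2 = 0.5 for id 2, as the question expects. Note that an id with only one row would still produce that single row's v2, so filter such groups out first if that is not wanted.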




Answer 2:


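A = LOAD 'input' USING PigStorage(',') AS (id:int, v1:float, v2:float); -- the sample values are decimals, so load them as float
B = ORDER A BY id ASC, v1 DESC; -- sort by id, then by v1 descending within each id
C = FOREACH B GENERATE id, v2; -- keeps every row, not only the second one per id
DUMP C;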


Source: https://stackoverflow.com/questions/13253863/accessing-an-element-like-array-in-pig
