Projecting Grouped Tuples in Pig

Posted by 旧街凉风 on 2019-12-08 19:08:46

Question


I have a collection of tuples of the form (t,a,b) that I want to group by b in Pig. Once grouped, I want to drop the field b from the tuples in each group and generate a bag of the remaining (t,a) tuples per group.

As an example, assume we have (1,2,1) (2,0,1) (3,4,2) (4,1,2) (5,2,3)

The pig script would produce {(1,2),(2,0)} {(3,4),(4,1)} {(5,2)}

The question is: how do I go about producing this result? I'm used to seeing examples where aggregation operations follow a group by operation. It's less clear to me how to filter the tuples and return them in a bag. Thanks for your assistance!


Answer 1:


Turns out what I was looking for is the syntax for nested projection in Pig.

If one has tuples of the form (t,a,b) and wants to drop b after the group by, it is done this way.

grouped = GROUP tups BY b;
result = FOREACH grouped GENERATE tups.(t,a);

See the "Nested Projection" section on the PigLatin page. http://wiki.apache.org/pig/PigLatin
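For illustration, the nested projection can be placed in a complete script. This is a minimal sketch, not a definitive implementation: the input file name 'tups.txt', the comma delimiter, the int field types, and the relation name with_key are assumptions that do not come from the original post.

```pig
-- Assumed input: a comma-delimited file 'tups.txt' holding the sample rows
-- 1,2,1 / 2,0,1 / 3,4,2 / 4,1,2 / 5,2,3 (file name and types are hypothetical).
tups = LOAD 'tups.txt' USING PigStorage(',') AS (t:int, a:int, b:int);

-- GROUP yields one row per distinct b: (group, tups), where the bag
-- takes its name from the grouped relation and holds the full (t,a,b) tuples.
grouped = GROUP tups BY b;

-- Nested projection: keep only t and a from every tuple inside the bag.
result = FOREACH grouped GENERATE tups.(t, a);

-- Optional variant that also keeps the grouping key next to the projected bag.
with_key = FOREACH grouped GENERATE group AS b, tups.(t, a);

DUMP result;
```

With the sample data from the question, result should contain one bag per value of b, holding the same (t,a) pairs listed there. The order of tuples inside each bag is not guaranteed.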



Source: https://stackoverflow.com/questions/10808202/projecting-grouped-tuples-in-pig
