What is the difference between GROUP and COGROUP in PIG?

China☆狼群 提交于 2019-12-13 11:47:25

问题


I understood Group didn't work with multiple tuples and hence we had COGROUP in PIG. However, while checking today the GROUP command works for me. I am using PIG-0.12.0. My commands and outputs are as follows.

grunt> grpvar = GROUP C by $2, B by $2;
grunt> cogrpvar = COGROUP C by $2, B by $2;
grunt> describe grpvar;

grpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}

grunt> describe cogrpvar;

cogrpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}

Is GROUP expected to work like this? What is the difference between GROUP and COGROUP them?


回答1:


Yes group is supposed to work like that !

According to the documentation ( http://pig.apache.org/docs/r0.12.0/basic.html#group ) :

Note: The GROUP and COGROUP operators are identical. Both operators work with one or more relations. For readability GROUP is used in statements involving one relation and COGROUP is used in statements involving two or more relations. You can COGROUP up to but no more than 127 relations at a time.

So it is just for readability, no differences between the two.



来源:https://stackoverflow.com/questions/25028629/what-is-the-difference-between-group-and-cogroup-in-pig

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!