How do I transpose columns and rows in Pig

时光总嘲笑我的痴心妄想 submitted on 2019-12-20 06:36:52

Question


I'm not sure whether this can be done with built-in Pig operations or whether I'll need to write a UDF. Essentially, I have a table whose data I simply want to transpose.

Simply put, given:

(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)
(11, 12, 13, 14, 15)
 ... 300 plus more tuples

I would end up with:

(1,6,11,...) -> goes on for a few hundred more
(2,7,12,...)
(3,8,13,...)
(4,9,14,...)
(5,10,15,...)

Any suggestions on how I could accomplish this?


Answer 1:


This is not possible with Pig, nor does it make much sense for it to be. Remember that a relation is a bag of tuples, and by definition, a bag is not guaranteed to have its tuples in any specific order. You might start with

(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)
(11, 12, 13, 14, 15)

but from Pig's perspective there is no difference between this and

(11, 12, 13, 14, 15)
(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)

which means that "transpose" is ill-defined. Look at it this way: if you transpose twice, you should get the original data structure back, but because the tuples can be reordered along the way, this is not guaranteed to happen.
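
To make the ordering argument concrete, here is a minimal sketch in plain Python (not Pig, and not something the answer itself proposes). It assumes a hypothetical helper `transpose` that treats list order as row order, and uses `collections.Counter` to compare collections as bags. The two orderings are the same bag, yet their transposes differ:

```python
from collections import Counter


def transpose(rows):
    """Transpose equal-length tuples, treating list order as row order (illustrative helper)."""
    return [tuple(column) for column in zip(*rows)]


# Two orderings of the same three tuples from the example above.
order_a = [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10), (11, 12, 13, 14, 15)]
order_b = [(11, 12, 13, 14, 15), (1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]

# Compared as bags (multisets), the two relations are identical.
same_bag = Counter(order_a) == Counter(order_b)

# Their transposes are not identical, even when compared as bags.
transposed_a = transpose(order_a)
transposed_b = transpose(order_b)
same_transposed_bag = Counter(transposed_a) == Counter(transposed_b)

print(same_bag)             # True
print(same_transposed_bag)  # False
print(transposed_a[0])      # (1, 6, 11)
print(transposed_b[0])      # (11, 1, 6)
```

The input carries no information about which ordering is the "right" one, so neither transpose can be called the correct answer.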

In the end, if you really must do matrix operations, you would be better off using a tool that respects ordering in both rows and columns.
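
As one illustration of what "respecting ordering" could look like, the sketch below uses plain Python lists. The choice of Python is an assumption, since the answer does not name a tool. It also assumes each row carries an explicit index (the hypothetical first element of each pair), which the original data does not have. Once the row order is pinned down, the transpose is well-defined and transposing twice returns the original rows:

```python
def transpose(rows):
    """Transpose equal-length tuples, treating list order as row order (illustrative helper)."""
    return [tuple(column) for column in zip(*rows)]


# Hypothetical input: (row_index, values) pairs arriving in arbitrary order.
indexed_rows = [
    (2, (11, 12, 13, 14, 15)),
    (0, (1, 2, 3, 4, 5)),
    (1, (6, 7, 8, 9, 10)),
]

# Fix the row order using the explicit index, then drop the index.
ordered = [values for _, values in sorted(indexed_rows, key=lambda pair: pair[0])]

transposed = transpose(ordered)
round_trip = transpose(transposed)

print(transposed)             # [(1, 6, 11), (2, 7, 12), (3, 8, 13), (4, 9, 14), (5, 10, 15)]
print(round_trip == ordered)  # True
```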

That said, what are you trying to accomplish?



Source: https://stackoverflow.com/questions/13498657/how-do-i-transpose-columns-and-rows-in-pig
