Understanding map syntax

Submitted by 感情迁移 on 2019-12-11 23:46:09

Question


I have some problems understanding how the map should be used.

Following this tutorial I created a file containing the following text:

[open#apache]
[apache#hadoop]

Then I was able to load that file without errors:

a = load 'data/file_name.txt' as (M:map []);

Now, how can I get the list of all the "values"? I.e.:

(apache)
(hadoop) 

Also, I have just started learning Pig, so any hints would be very helpful.


Answer 1:


The only operator for interacting with a map is #, the key lookup. Anything beyond that requires a UDF. So in pure Pig, a map can really only be used like this:

B = FOREACH A GENERATE M#'open' ;

Which produces this as output:

(apache)
()

Note that the value after the # is a quoted string constant: it cannot vary from record to record and must be fixed before you run the job.
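
As a sketch of that constraint, each key of interest has to be spelled out as its own constant lookup. The field names open_val and apache_val below are made up for illustration; the file path and the map field M come from the question.

```pig
-- Load each line of the file as one untyped map (path taken from the question).
A = LOAD 'data/file_name.txt' AS (M:map[]);

-- One constant-key lookup per key; open_val and apache_val are illustrative names.
B = FOREACH A GENERATE M#'open' AS open_val, M#'apache' AS apache_val;

DUMP B;
```

With the two-line file from the question, the first row should carry a value only for 'open' and the second only for 'apache'; the lookups that miss come back as nulls.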

Also, notice that it creates a NULL for the second line, because that map does not contain a key with the value 'open'. This is slightly different from using FILTER on a schema of two chararrays, key and value:

B = FILTER A BY key=='open' ;

Which produces the output:

(open,apache)
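
The FILTER version assumes A was loaded with a different schema than the map in the question. A minimal sketch, assuming a hypothetical tab-separated file data/pairs.txt with one key and one value per line (open<TAB>apache, then apache<TAB>hadoop):

```pig
-- Hypothetical file: each line is "key<TAB>value"; the default loader splits on tabs.
A = LOAD 'data/pairs.txt' AS (key:chararray, value:chararray);

-- Keep only the rows whose key is 'open'; non-matching rows are dropped, not nulled.
B = FILTER A BY key == 'open';

DUMP B;
```

Unlike the map lookup, the row for 'apache' disappears entirely here instead of turning into an empty tuple.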

If only the value is desired, then it can be done simply by:

B = FOREACH (FILTER A BY key=='open') GENERATE value ;

Which produces:

(apache)

If keeping the NULLs is important, they can also be generated by using a bincond:

B = FOREACH A GENERATE (key=='open'?value:NULL) ;

Which produces the same output as M#'open'.

In my experience, maps are not very useful because of how restrictive they are.
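
For the original question (listing every value regardless of key), newer Pig releases include built-in map functions such as VALUESET and VALUELIST, so a hand-written UDF may not be needed. A sketch, assuming your Pig version ships them (I believe they arrived around 0.11; check the built-in function list for your release):

```pig
-- Same load as in the question: one untyped map per input line.
A = LOAD 'data/file_name.txt' AS (M:map[]);

-- VALUESET returns a bag of the map's distinct values; FLATTEN turns that bag into rows.
B = FOREACH A GENERATE FLATTEN(VALUESET(M));

DUMP B;
```

VALUELIST works the same way but keeps duplicate values, and KEYSET returns the keys instead of the values.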



Source: https://stackoverflow.com/questions/17772308/understanding-map-syntax
