How to project an alias using a wildcard?

Posted by 二次信任 on 2019-12-13 16:35:10

Question


Once I do a join A by id, B by id, I get an alias with fields A::f..., B::f... Is there a way to project it on only the A fields?

C = join A by id, B by id;
D = filter C by B::n < 1000;
E = foreach D generate A::*;

I get

Unexpected character '*'

What I want is E with the schema identical to A (i.e., describe E and describe A should print the exact same things).

How do I do that?


Answer 1:


You can use a project-range expression to get part of the way there.

Unfortunately, there is no way to systematically strip the A:: prefix. If you know the name of the last field of A (suppose it's last), you can do this:

E = foreach D generate .. A::last;

If you wanted just the fields from B you would do

E = foreach D generate B::first ..;

If you really need to apply a specific schema, perhaps you could just define a macro that applies this schema whenever you need it, so you can overwrite any of the changes that come from grouping, joining, etc.
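Short of writing a macro, one way to sketch the "apply a specific schema" idea is to rename each field inline with as. The field names id, name, and addr below are hypothetical, since the question does not show A's actual schema; substitute the real field names.

```pig
-- Assumption: A was loaded with the fields (id, name, addr).
-- Each A:: field is projected and renamed, so the prefix is dropped explicitly.
E = foreach D generate A::id as id, A::name as name, A::addr as addr;
describe E;
```

This has to list every field by hand, so it does not scale to wide relations, but the resulting field names no longer carry the A:: prefix.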




Answer 2:


There is no way to keep a common alias name after joining, but you can generate specific columns from the join result. For example:

A = load 'data1' as (id, name, addr);
B = load 'data2' as (id, name2, addr2);
C = join A by id, B by id;        -- Now C has id, name, addr, id, name2, addr2

-- No parentheses around the list: ($0, $1, $2) would build a single tuple field
D = foreach C generate $0, $1, $2;

Relation D now contains only the columns that came from A: id, name, and addr.



Source: https://stackoverflow.com/questions/24812457/how-to-project-an-alias-using-a-wildcard
