Pig equivalent of SQL GREATEST / LEAST?

Posted by 此生再无相见时 on 2019-12-19 09:57:10

Question


I'm trying to find the Pig equivalent of the SQL functions GREATEST and LEAST. These functions are the scalar equivalent of the aggregate SQL functions MAX and MIN, respectively.

Essentially, I want to be able to say something like this:

x = LOAD 'file:///a/b/c.csv' USING PigStorage() AS (a: int, b: int, c: int);
y = FOREACH x GENERATE a AS a: int, b AS b: int, c AS c: int, GREATEST(a, b, c) AS g: int;

I know I could use bags and MAX to get this done, but I'm translating from another language into Pig and that implementation would be difficult to integrate.

Is there an "inline" approach I could use here? Some builtin function I'm overlooking, or maybe a UDF in Piggybank or DataFu, for example, would be ideal! If there's a completely "inline" version that uses bags and I'm just not thinking of it, that's fine too!

Thank you!


Answer 1:


It turns out there is an "inline" bag-based approach that works: pack the fields into a bag with TOBAG and apply MAX to it (or MIN, for LEAST):

x = LOAD 'file:///a/b/c.csv' USING PigStorage() AS (a: int, b: int, c: int);
y = FOREACH x GENERATE a AS a: int, b AS b: int, c AS c: int, MAX(TOBAG(a, b, c)) AS g: int;
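
For LEAST, the same pattern should work with MIN in place of MAX. The sketch below is an untested variant of the script above; the relation name z and the alias l are illustrative and not from the original answer. One caveat worth checking against your SQL dialect: Pig's MAX and MIN ignore nulls, whereas GREATEST and LEAST in some databases return NULL as soon as any argument is NULL.

```pig
-- Same input as above: three int columns.
x = LOAD 'file:///a/b/c.csv' USING PigStorage() AS (a: int, b: int, c: int);

-- TOBAG packs the row's fields into a bag; MAX and MIN then reduce that per-row bag.
z = FOREACH x GENERATE a, b, c,
        MAX(TOBAG(a, b, c)) AS g: int,   -- stands in for GREATEST(a, b, c)
        MIN(TOBAG(a, b, c)) AS l: int;   -- stands in for LEAST(a, b, c)
```

Because both expressions are evaluated inside a single FOREACH, this stays "inline" and needs no GROUP or extra join.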


Source: https://stackoverflow.com/questions/27262945/pig-equivalent-of-sql-greatest-least
