BigQuery User Defined Aggregation Function?

我的未来我决定 提交于 2019-11-30 04:23:35

问题


I know I can define a User Defined Function in order to perform some custom calculation. I also know I can use the 'out-of-the-box' aggregation functions to reduce a collection of values to a single value when using a GROUP BY clause.

Is it possible to define a custom user-defined, Aggregation Function to use with a GROUP BY clause?


回答1:


Turns out that this IS possible (as long as the groups we seek to aggregate are of a reasonable size in memory) with a little bit of 'glue' - namely the ARRAY_AGG function

The steps are as follows:

  1. Create a UDF with an input parameter of type ARRAY<T> where T is the type of value you want to aggregate.
  2. Use the ARRAY_AGG function in the query with the GROUP BY clause to generate an array of T and pass into your UDF.

As a concrete example:

CREATE TEMP FUNCTION aggregate_fruits(fruits ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
return "my fruit bag contains these items: " + fruits.join(",");
""";

WITH fruits AS
(SELECT "apple" AS fruit
UNION ALL SELECT "pear" AS fruit
UNION ALL SELECT "banana" AS fruit)

SELECT aggregate_fruits(ARRAY_AGG(fruit))
FROM fruits


来源:https://stackoverflow.com/questions/43974448/bigquery-user-defined-aggregation-function

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!