Google cloud Big query UDF limitations

只愿长相守 提交于 2019-12-12 04:32:17

问题


I am facing a problem in Google bigquery. I have some complex computation need to do and save the result in Bigquery. So we are doing that complex computation in Java and saving result in google bigquery with the help of Google cloud dataflow.

But this complex calculation is taking around 28 min to complete in java. Customer requirement is to do within 20 sec.

So we switch to Google bigquery UDF option. One option is Bigquery legacy UDF. Bigquery legacy UDF have limitation that it is processing row one by one so we phased out this option. As we need multiple rows to process the results.

Second option is Scalar UDF. Big query scalar UDF are only can be called from WEB UI or command line and can not be trigger from java client.

If any one have any idea the please provide the direction on the problem how to proceed.


回答1:


You can use scalar UDFs with standard SQL from any client API, as long as the CREATE TEMPORARY FUNCTION statements are passed in the query attribute of the request. For example,

QueryRequest queryRequest =
    QueryRequest
        .newBuilder(
            "CREATE TEMP FUNCTION GetWord() AS ('fire');\n"
                + "SELECT COUNT(DISTINCT corpus) as works_with_fire\n"
                + "FROM `bigquery-public-data.samples.shakespeare`\n"
                + "WHERE word = GetWord();")
        // Use standard SQL syntax for queries.
        // See: https://cloud.google.com/bigquery/sql-reference/
        .setUseLegacySql(false)
        .build();
QueryResponse response = bigquery.query(queryRequest);



回答2:


Big query scalar UDF are only can be called from WEB UI or command line and can not be trigger from java client.

This is not accurate. Standard SQL supports scalar UDFs through CREATE TEMPORARY FUNCTION statement which can be used from any application and any client - it is simply part of the SQL query:

https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions

To learn how to enable Standard SQL, see this documentation: https://cloud.google.com/bigquery/docs/reference/standard-sql/enabling-standard-sql Particularly, simplest thing would be to add #standardSql tag at the beginning of SQL query.



来源:https://stackoverflow.com/questions/43050645/google-cloud-big-query-udf-limitations

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!