Avoid multiple calls on same function when expanding composite result

陌路散爱 提交于 2019-12-11 02:54:39

问题


I have an SQL function retuning a composite result.

CREATE TYPE result_t AS (a int, b int, c int, d numeric);

CREATE OR REPLACE FUNCTION slow_function(int)
RETURNS result_t
AS $$
    -- just some placeholder code to make it slow
    SELECT 0, 0, 0, (
        SELECT sum(ln(i::numeric))
        FROM generate_series(1, $1) i
    )
$$ LANGUAGE sql IMMUTABLE;

When calling the function, I would like to have the parts of the composite type expanded into several columns. That works fine when I call:

SELECT (slow_function(i)).*
FROM generate_series(0, 200) i

a     b     c     d                    
----  ----  ----  -------------------- 
0     0     0     (null)               
0     0     0     0                    
0     0     0     0.6931471805599453   
0     0     0     1.791759469228055
...
Total runtime: 6196.754 ms

Unfortunately, this causes the function to be called once per result column, which is unnecessarily slow. This can be tested by comparing the run time with a query, which directly returns the composite result and runs four times as fast:

SELECT slow_function(i)
FROM generate_series(0, 200) i
...
Total runtime: 1561.476 ms

The example code is also at http://sqlfiddle.com/#!15/703ba/7

How can I get the result with multiple columns without wasting CPU power?


回答1:


A CTE is not even necessary. A plain subquery does the job as well (tested with pg 9.3):

SELECT i, (f).*                     -- decompose here
FROM  (
   SELECT i, (slow_func(i)) AS f    -- do not decompose here
   FROM   generate_series(1, 3) i
   ) sub;

Be sure not to decompose the composite result of the function in the subquery. Reserve that for the outer query.
Requires a well known type, of course. Would not work with anonymous records.

Or, what @Richard wrote, a LATERAL JOIN works, too. The syntax can be simpler:

SELECT * FROM generate_series(1, 3) i, slow_func(i) f
  • LATERAL is applied implicitly in Postgres 9.3 or later.
  • A function can stand on its own in the FROM clause, doesn't have to be wrapped in an additional sub-select. Just imagine a table in its place.

SQL Fiddle with EXPLAIN VERBOSE output for all variants. You can see multiple evaluation of the function if it happens.

COST setting

Generally (should not matter for this particular query), make sure to apply a high cost setting to your function, so the planner knows to avoid evaluating more often then necessary. Like:

CREATE OR REPLACE FUNCTION slow_function(int)
  RETURNS result_t AS
$func$
    -- expensive body
$func$ LANGUAGE sql IMMUTABLE COST 100000;

Per documentation:

Larger values cause the planner to try to avoid evaluating the function more often than necessary.




回答2:


Perhaps a LATERAL subquery is what you want.

SELECT t.id, f.* FROM some_table t, LATERAL (SELECT slow_func(t.id)) f

That will call the function once for each row, and then "unwrap" the result into columns in the output. Any subquery will do for the "unwrapping" but LATERAL is what lets you reference columns from other clauses.

I believe LATERAL was introduced in PostgreSQL 9.3




回答3:


One way to work around this problem is by using a WITH clause.

WITH t AS (
    SELECT slow_function(i) AS s
    FROM generate_series(0, 200) i
)
SELECT (s).*
FROM t

This works because the result of the WITH clause gets materialized before the rest of the query is executed. But this solution is often bad for more complex cases, because it greatly reduces the query optimizer's options for improving the query execution in other ways. So I'm still looking for a better way to solve this problem.



来源:https://stackoverflow.com/questions/24391293/avoid-multiple-calls-on-same-function-when-expanding-composite-result

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!