Recursive SQL statement (PostgreSQL 9.1.4)

余生分开走 2021-02-20 13:31

PostgreSQL 9.1

Business situation

Every month, there is a new batch of accounts given to a specific process. Every batch can be described by mon

2 answers
  •  轮回少年
    2021-02-20 14:22

    It's a big task; split it up to make it more manageable. I would put that in a plpgsql function with RETURNS TABLE:

    1. Create a temporary table for your "Calculation Process" matrix using a crosstab query. You need the tablefunc module installed for that. Run (once per database):

      CREATE EXTENSION tablefunc;
      
    2. Update the temp table field by field.

    3. Return table.
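
    Before running the demo, it may help to confirm that tablefunc is actually available in the current database. A minimal sketch, assuming you have the privileges to create extensions; the catalog query only reads pg_extension:

    ```sql
    -- Install the module only if it is not there yet
    CREATE EXTENSION IF NOT EXISTS tablefunc;

    -- Returns one row once tablefunc is installed in this database
    SELECT extname, extversion
    FROM   pg_extension
    WHERE  extname = 'tablefunc';
    ```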

    The following demo is fully functional and tested with PostgreSQL 9.1.4. Building on the table definition provided in the question:

    -- DROP FUNCTION f_forcast();
    
    CREATE OR REPLACE FUNCTION f_forcast()
      RETURNS TABLE (
      granularity date
     ,entry_accounts numeric
     ,entry_amount numeric
     ,d1 numeric
     ,d2 numeric
     ,d3 numeric
     ,d4 numeric
     ,d5 numeric
     ,d6 numeric) AS
    $BODY$
    BEGIN
    
    --== Create temp table with result of crosstab() ==--
    
    CREATE TEMP TABLE matrix ON COMMIT DROP AS
    SELECT *
    FROM   crosstab (
            'SELECT granularity, entry_accounts, entry_amount
                   ,distance_in_months, recovery_amount
             FROM   vintage_data
             ORDER  BY 1, 2',
    
            'SELECT DISTINCT distance_in_months
             FROM   vintage_data
             ORDER  BY 1')
    AS tbl (
      granularity date
     ,entry_accounts numeric
     ,entry_amount numeric
     ,d1 numeric
     ,d2 numeric
     ,d3 numeric
     ,d4 numeric
     ,d5 numeric
     ,d6 numeric
     );
    
    ANALYZE matrix; -- update statistics to help calculations
    
    
    --== Calculations ==--
    
    -- I implemented the first calculation for X1 and leave the rest to you.
    -- Can probably be generalized in a loop or even a single statement.
    
    UPDATE matrix m
    SET    d4 = (
        SELECT (sum(x.d1) + sum(x.d2) + sum(x.d3) + sum(x.d4))
                /(sum(x.d1) + sum(x.d2) + sum(x.d3)) - 1
                -- removed redundant sum(entry_amount) from equation
        FROM  (
            SELECT *
            FROM   matrix a
            WHERE  a.granularity < m.granularity
            ORDER  BY a.granularity DESC
            LIMIT  3
            ) x
        ) * (m.d1 + m.d2 + m.d3)
    WHERE m.granularity = '2012-04-30';
    
    --- Next update X2 ..
    
    
    --== Return results ==--
    
    RETURN QUERY
    TABLE  matrix
    ORDER  BY 1;
    
    END;
    $BODY$ LANGUAGE plpgsql;
    

    Call:

    SELECT * FROM f_forcast();
    

    I have simplified quite a bit, removing some redundant steps in the calculation.
    The solution employs a variety of advanced techniques. You need to know your way around PostgreSQL to work with this.
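
    To try the function without the original data, something like the following could serve as a stand-in. This is only a sketch: the column names are taken from the crosstab() source query above, but the column types and all sample values are assumptions made up for illustration. The sample deliberately contains exactly six distinct distance_in_months values, because the category query must produce as many categories as there are d1 .. d6 output columns.

    ```sql
    -- Hypothetical stand-in for the table from the question (types are assumed)
    CREATE TABLE vintage_data (
      granularity        date
     ,distance_in_months integer
     ,entry_accounts     numeric
     ,entry_amount       numeric
     ,recovery_amount    numeric
    );

    -- Made-up rows: each later batch has one month less of recovery history,
    -- and the recovery in month d is simply 10 * d.
    INSERT INTO vintage_data
          (granularity, distance_in_months, entry_accounts, entry_amount, recovery_amount)
    SELECT g.granularity, d, 100, 1000, 10 * d
    FROM  (VALUES ('2012-01-31'::date, 6)
                 ,('2012-02-29', 5)
                 ,('2012-03-31', 4)
                 ,('2012-04-30', 3)) AS g(granularity, months)
    CROSS  JOIN generate_series(1, 6) AS d
    WHERE  d <= g.months;

    SELECT * FROM f_forcast();
    ```

    With these values, the three batches before 2012-04-30 sum to 30, 60, 90 and 120 for d1 .. d4, so the factor is 300 / 180 - 1 and the forecast d4 for 2012-04-30 should come out close to 40 (that is, about 0.667 * 60).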
