Efficient time series querying in Postgres

逝去的感伤 2020-12-30 13:05

I have a table in my PG db that looks somewhat like this:

id | widget_id | for_date | score |

Each referenced widget has a lot of these ite

4 Answers
  •  佛祖请我去吃肉
    2020-12-30 13:45

    The data:

    DROP SCHEMA IF EXISTS tmp CASCADE;
    CREATE SCHEMA tmp ;
    SET search_path=tmp;
    
    CREATE TABLE widget
            ( widget_id INTEGER NOT NULL
            , for_date DATE NOT NULL
            , score INTEGER
             , PRIMARY KEY (widget_id,for_date)
            );
    INSERT INTO widget(widget_id , for_date , score) VALUES
     (1312, '2012-05-07', 20)
    , (1337, '2012-05-07', 12)
    , (1337, '2012-05-08', 41)
    , (1337, '2012-05-11', 500)
            ;
    

    The query:

    SELECT w.widget_id AS widget_id
            , cal::date AS for_date
            -- , w.for_date AS org_date
            , w.score AS score
    FROM generate_series( '2012-05-07'::timestamp , '2012-05-11'::timestamp
                     , '1day'::interval) AS cal
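            -- "half cartesian" join: each calendar day is paired with every
            -- widget row dated on or before it; the NOT EXISTS() below keeps
            -- only the most recent such row per widget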
    LEFT JOIN widget w ON w.for_date <= cal
    WHERE NOT EXISTS (
            SELECT * FROM widget nx
            WHERE nx.widget_id = w.widget_id
            AND nx.for_date <= cal
            AND nx.for_date > w.for_date
            )
    ORDER BY cal, w.widget_id
            ;
    

    The result:

     widget_id |  for_date  | score 
    -----------+------------+-------
          1312 | 2012-05-07 |    20
          1337 | 2012-05-07 |    12
          1312 | 2012-05-08 |    20
          1337 | 2012-05-08 |    41
          1312 | 2012-05-09 |    20
          1337 | 2012-05-09 |    41
          1312 | 2012-05-10 |    20
          1337 | 2012-05-10 |    41
          1312 | 2012-05-11 |    20
          1337 | 2012-05-11 |   500
    (10 rows)
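
    Since the question asks about efficiency, here is a hedged alternative sketch of the same forward-fill using a LATERAL subquery (available from PostgreSQL 9.3). It assumes the tmp.widget table and sample rows defined above; the aliases ids, l and w are chosen for illustration only. For the sample data it should return the same ten rows as the query above. One difference: a calendar day on which no widget has any earlier row is omitted here, whereas the LEFT JOIN version would emit a row of NULLs for it.

    ```sql
    -- Sketch only, not a benchmarked replacement for the query above.
    SELECT ids.widget_id AS widget_id
            , cal::date AS for_date
            , w.score AS score
    FROM generate_series( '2012-05-07'::timestamp , '2012-05-11'::timestamp
                     , '1day'::interval) AS cal
            -- one row per (calendar day, widget)
    CROSS JOIN (SELECT DISTINCT widget_id FROM widget) AS ids
            -- per pair, fetch the most recent row on or before that day;
            -- the primary key index (widget_id, for_date) can serve this lookup
    CROSS JOIN LATERAL (
            SELECT l.score
            FROM widget l
            WHERE l.widget_id = ids.widget_id
            AND l.for_date <= cal
            ORDER BY l.for_date DESC
            LIMIT 1
            ) AS w
    ORDER BY cal, ids.widget_id
            ;
    ```

    Whether this actually runs faster than the NOT EXISTS() form depends on the data volume and the plan chosen, so compare both with EXPLAIN ANALYZE on real data.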
    
