Efficient time series querying in Postgres

逝去的感伤 2020-12-30 13:05

I have a table in my PG db that looks somewhat like this:

id | widget_id | for_date | score |

Each referenced widget has a lot of these ite

4 Answers
  •  长发绾君心
    2020-12-30 13:45

    Using your table structure, I wrote the following recursive CTE. It starts at each widget's MIN(for_date) and steps forward one day at a time until it reaches that widget's MAX(for_date), reusing the previous day's score whenever a day has no row of its own. There may be a more efficient way, but this appears to work well:

    WITH RECURSIVE nodes_cte(widgetid, for_date, score) AS (
        -- Anchor: each widget's row on its earliest date
        SELECT
            w.widgetid,
            w.for_date,
            w.score
        FROM widgets w
        INNER JOIN (
            SELECT widgetid, MIN(for_date) AS min_for_date
            FROM widgets
            GROUP BY widgetid
        ) minW ON w.widgetid = minW.widgetid
              AND w.for_date = minW.min_for_date
        UNION ALL
        -- Recursive step: advance one day; reuse the previous score if that day has no row
        SELECT
            n.widgetid,
            n.for_date + 1 AS for_date,
            COALESCE(w.score, n.score) AS score
        FROM nodes_cte n
        INNER JOIN (
            SELECT widgetid, MAX(for_date) AS max_for_date
            FROM widgets
            GROUP BY widgetid
        ) maxW ON n.widgetid = maxW.widgetid
        LEFT JOIN widgets w ON n.widgetid = w.widgetid
              AND n.for_date + 1 = w.for_date
        -- Stop once the widget's latest date has been reached
        WHERE n.for_date + 1 <= maxW.max_for_date
    )
    SELECT *
    FROM nodes_cte
    ORDER BY widgetid, for_date
    

    Here is the SQL Fiddle.

    And the returned results (format the date however you'd like):

    WIDGETID   FOR_DATE                     SCORE
    1337       May, 07 2012 00:00:00+0000   12
    1337       May, 08 2012 00:00:00+0000   41
    1337       May, 09 2012 00:00:00+0000   41
    1337       May, 10 2012 00:00:00+0000   41
    1337       May, 11 2012 00:00:00+0000   500
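
    For comparison, here is a rough sketch of a non-recursive variant built on generate_series, which may be worth benchmarking against the recursive query. The CREATE TABLE statement and the three sample rows are assumptions inferred from the column names and the output above, not the asker's real schema, and LATERAL requires PostgreSQL 9.3 or later:

    ```sql
    -- Assumed schema, mirroring the names used in the recursive query above.
    CREATE TABLE widgets (
        widgetid integer NOT NULL,
        for_date date    NOT NULL,
        score    integer NOT NULL,
        PRIMARY KEY (widgetid, for_date)
    );

    -- Sample rows inferred from the output shown above (an assumption).
    INSERT INTO widgets (widgetid, for_date, score) VALUES
        (1337, DATE '2012-05-07', 12),
        (1337, DATE '2012-05-08', 41),
        (1337, DATE '2012-05-11', 500);

    -- One row per widget per calendar day; each day takes the score of the
    -- most recent row on or before that day.
    SELECT r.widgetid,
           d.day::date AS for_date,
           s.score
    FROM (
        SELECT widgetid,
               MIN(for_date) AS min_date,
               MAX(for_date) AS max_date
        FROM widgets
        GROUP BY widgetid
    ) AS r
    CROSS JOIN LATERAL generate_series(
        r.min_date::timestamp,
        r.max_date::timestamp,
        INTERVAL '1 day'
    ) AS d(day)
    CROSS JOIN LATERAL (
        SELECT w.score
        FROM widgets AS w
        WHERE w.widgetid = r.widgetid
          AND w.for_date <= d.day::date
        ORDER BY w.for_date DESC
        LIMIT 1
    ) AS s
    ORDER BY r.widgetid, d.day;
    ```

    With the assumed sample rows this should produce the same five rows as the recursive query. The inner lookup can use an index on (widgetid, for_date), such as the primary key assumed here, so whether it actually beats the recursive version on a large table is something to check with EXPLAIN ANALYZE.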
    

    Please note that this assumes for_date is a date column. If it is a timestamp (that is, it includes a time), replace + 1 with + INTERVAL '1 day' in the recursive query above.
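
    A small illustration of the difference, using literal values chosen only for this example:

    ```sql
    -- date + integer advances by whole days and the result is still a date
    SELECT DATE '2012-05-07' + 1 AS next_day;

    -- timestamp has no "+ integer" operator, so an interval is needed
    SELECT TIMESTAMP '2012-05-07 00:00:00' + INTERVAL '1 day' AS next_day;
    ```

    In the recursive step, that would mean writing n.for_date + INTERVAL '1 day' in the SELECT list, the LEFT JOIN condition, and the WHERE clause.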

    Hope this helps.
