How to fill missing dates by groups in a table in sql

前端未结

关注

 2  2010

I want to know how to use loops to fill in missing dates with value zero based on the start/end dates by groups in sql so that i have consecutive time series in each group.

相关标签:

2条回答

南旧

2020-12-16 15:21

You can do it like this without loops

SELECT p.date, COALESCE(a.value, 0) value, p.grp_no
  FROM
(
  SELECT grp_no, date
    FROM
  (
    SELECT grp_no, MIN(date) min_date, MAX(date) max_date
      FROM tableA
     GROUP BY grp_no
  ) q CROSS JOIN tableb b 
   WHERE b.date BETWEEN q.min_date AND q.max_date
) p LEFT JOIN TableA a
    ON p.grp_no = a.grp_no 
   AND p.date = a.date

The innermost subquery grabs min and max dates per group. Then cross join with TableB produces all possible dates within the min-max range per group. And finally outer select uses outer join with TableA and fills value column with 0 for dates that are missing in TableA.

Output:

|       DATE | VALUE | GRP_NO |
|------------|-------|--------|
| 2012-08-06 |     1 |      1 |
| 2012-08-07 |     0 |      1 |
| 2012-08-08 |     1 |      1 |
| 2012-08-09 |     0 |      1 |
| 2012-08-07 |     2 |      2 |
| 2012-08-08 |     1 |      2 |
| 2012-08-09 |     0 |      2 |
| 2012-08-10 |     0 |      2 |
| 2012-08-11 |     0 |      2 |
| 2012-08-12 |     3 |      2 |

Here is SQLFiddle demo

0 讨论(0)

情话喂你

2020-12-16 15:38
The following query does a union with tableA and tableB. It then uses group by to merge the rows from tableA and tableB so that all of the dates from tableB are in the result. If a date is not in tableA, then the row has 0 for value and grp_no. Otherwise, the row has the actual values for value and grp_no.
```
select
   dat,
   sum(val),
   sum(grp)
from
   (
      select
         date as dat,
         value as val,
         grp_no as grp
      from
         tableA
   union
      select
         date,
         0,
         0
      from
         tableB
      where
         date >= date '2012-08-06' and
         date <= date '2012-08-13'
   )
group by
   dat
order by
   dat
```
If find this query to be easier for me to understand. It also runs faster. It takes 16 seconds whereas a similar right join query takes 32 seconds.

This solution only works with numerical data.

This solution assumes a fixed date range. With some extra work this query can be adapted to limit the date range to what is found in tableA.
0 讨论(0)
发布评论:

提交评论
- 加载中...