Return empty rows for non-existing data

Question

OK, I have a table with a date column and an integer column, and I want to retrieve all the rows grouped by day within a certain date range. Since there are not rows for every day, is it possible to make MySQL return rows for the missing days with a default value?

Example:

source table:

date         value
2020-01-01   1
2020-01-01   2
2020-01-03   2
2020-01-07   3
2020-01-08   4
2020-01-08   1

Standard behaviour after grouping by date and summing values:

2020-01-01   3
2020-01-03   2
2020-01-07   3
2020-01-08   5

Desired behaviour/result with empty rows:

2020-01-01   3
2020-01-02   0
2020-01-03   2
2020-01-04   0
2020-01-05   0
2020-01-06   0
2020-01-07   3
2020-01-08   5

Answer 1:

You can do something like the below:

# table creation:

drop table if exists test_table;

create table test_table (your_date date, your_value int(11));
insert into test_table (your_date, your_value) values ('2020-01-01', 1);
insert into test_table (your_date, your_value) values ('2020-01-01', 2);
insert into test_table (your_date, your_value) values ('2020-01-03', 2);
insert into test_table (your_date, your_value) values ('2020-01-07', 3);
insert into test_table (your_date, your_value) values ('2020-01-08', 4);
insert into test_table (your_date, your_value) values ('2020-01-08', 1);

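The query below creates a list of basically all the dates: the derived table sbqry cross-joins five digit tables, which gives 100,000 consecutive days starting at 1970-01-01. You then filter for the dates you're interested in, left join with your table, and group.

You could also replace the date literals in the WHERE clause with subqueries (the min and max date of your table) to make the range dynamic.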

It's a bit of a work-around but it works.

select sbqry.base_date, sum(ifnull(t.your_value, 0))
from (select adddate('1970-01-01',t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) base_date from
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) sbqry
left join test_table t on base_date = t.your_date
where sbqry.base_date between '2020-01-01' and '2020-01-08'
group by sbqry.base_date;

input:

+------------+------------+
| your_date  | your_value |
+------------+------------+
| 2020-01-01 |          1 |
| 2020-01-01 |          2 |
| 2020-01-03 |          2 |
| 2020-01-07 |          3 |
| 2020-01-08 |          4 |
| 2020-01-08 |          1 |
+------------+------------+

output:

+------------+------------------------------+
| base_date  | sum(ifnull(t.your_value, 0)) |
+------------+------------------------------+
| 2020-01-01 |                            3 |
| 2020-01-02 |                            0 |
| 2020-01-03 |                            2 |
| 2020-01-04 |                            0 |
| 2020-01-05 |                            0 |
| 2020-01-06 |                            0 |
| 2020-01-07 |                            3 |
| 2020-01-08 |                            5 |
+------------+------------------------------+
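The dynamic variant mentioned above could look roughly like this. It is a sketch that reuses the same date generator and only swaps the literal dates in the WHERE clause for min/max subqueries on test_table; the alias total is added here for readability and is not part of the original query.

```sql
-- Same generator as the query above; only the WHERE clause changes.
select sbqry.base_date, sum(ifnull(t.your_value, 0)) as total
from (select adddate('1970-01-01', t4.i*10000 + t3.i*1000 + t2.i*100 + t1.i*10 + t0.i) base_date from
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3,
    (select 0 i union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t4) sbqry
left join test_table t on sbqry.base_date = t.your_date
-- The range now follows whatever dates are present in the table.
where sbqry.base_date between (select min(your_date) from test_table)
                          and (select max(your_date) from test_table)
group by sbqry.base_date;
```

With the sample data the range resolves to 2020-01-01 through 2020-01-08, so the result should match the output shown above.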

Answer 2:

You could also achieve what you want with the following query, which may be easier to understand (it assumes a table named test with columns date_value and value):

SELECT
     date_table.date,
     IFNULL(SUM(value),0) as sum_val
FROM (
     SELECT DATE_ADD('2020-01-01', INTERVAL (@i:=@i+1)-1 DAY) AS `date`
     FROM information_schema.columns,(SELECT @i:=0) gen_sub
     WHERE DATE_ADD('2020-01-01',INTERVAL @i DAY) BETWEEN '2020-01-01' AND '2020-01-08'
) date_table
LEFT JOIN test ON test.date_value = date_table.date
GROUP BY date;


You could set variables to fix the min and max dates:

SET @date_min = '2020-01-01';
SET @date_max = '2020-01-08';

SELECT DATE_ADD(@date_min, INTERVAL (@i:=@i+1)-1 DAY) AS `date`
FROM information_schema.columns, (SELECT @i:=0) gen_sub
WHERE DATE_ADD(@date_min, INTERVAL @i DAY) BETWEEN @date_min AND @date_max;

Some explanation:

In effect, your question requires us to generate a continuous set of dates: we left join that set with your table so that dates with no records in your table still appear in the result.

This would be pretty easy in PostgreSQL thanks to the generate_series function, but it is not that easy in MySQL, where such a function doesn't exist. That's why we need to be smart.
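For what it's worth, newer versions narrow this gap: MySQL 8.0 and later support recursive common table expressions, which can produce the date series directly. A minimal sketch, assuming MySQL 8.0+ and the test_table from the first answer (the names date_table, base_date and sum_val are only illustrative):

```sql
-- Assumes MySQL 8.0+ (recursive CTE support) and the test_table defined in Answer 1.
with recursive date_table (base_date) as (
    -- Anchor row: first day of the range, typed as DATE.
    select date('2020-01-01')
    union all
    -- Recursive step: add one day until the last day of the range is reached.
    select base_date + interval 1 day
    from date_table
    where base_date < '2020-01-08'
)
select d.base_date, ifnull(sum(t.your_value), 0) as sum_val
from date_table d
left join test_table t on t.your_date = d.base_date
group by d.base_date
order by d.base_date;
```

One caveat: the recursion depth is capped by the cte_max_recursion_depth system variable (1000 by default), so ranges longer than that would need the limit raised for the session.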

Both solutions here share the same logic: they increment a date value, day by day, once for each row of another table; let's call it the 'source table'. In the answer above (not mine), the source table is built from many unions and cross joins (it generates 100k rows); in my case, the source table is information_schema.columns, which already contains lots of rows (1800+).

In the answer above, the initial date is fixed to 1970-01-01 and is incremented 100,000 times, which yields a set of 100,000 dates beginning with 1970-01-01.

In my case, the initial date is fixed to your minimum range date, 2020-01-01, and is incremented once for each row found in information_schema.columns, so around 1800 times. The generator can therefore produce up to around 1800 dates beginning with 2020-01-01; the WHERE clause keeps only those inside your range.

Finally, you left join this generated set of dates with your table (whichever way you generated it) in order to compute sum(value) for each day in your desired range.

Hope that helps you understand the logic behind both queries ;)
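One consequence worth checking: the generator can never return more dates than information_schema.columns has rows, and that count depends on the server and on your privileges, so a long range could be cut off silently. A quick sanity check, as a sketch (the aliases available_rows and days_needed are made up for illustration):

```sql
-- Compare the number of rows available to the generator
-- with the number of days in the requested range (both ends included).
select count(*) as available_rows,
       datediff('2020-01-08', '2020-01-01') + 1 as days_needed
from information_schema.columns;
```

If days_needed exceeds available_rows, one option is to cross join information_schema.columns with itself to get more rows, or to use another generator.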

Source: https://stackoverflow.com/questions/59825829/return-empty-rows-for-not-existsting-data

Tags: mysql, group-by, rows, intervals