Optimize MySQL query for grouping by date

Question

These are my tables:

CREATE TABLE IF NOT EXISTS `test_dates` (
  `date` date NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `test_log` (
  `id` int(10) unsigned NOT NULL,
  `timest` datetime NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

ALTER TABLE `test_dates`
  ADD PRIMARY KEY (`date`);

ALTER TABLE `test_log`
  ADD PRIMARY KEY (`id`),
  ADD KEY `emissione` (`timest`);

I have this query to count logs per date:

SELECT d.date, COUNT(l.id) 
FROM test_dates d
LEFT JOIN test_log l ON l.timest>=d.date AND l.timest<d.date + INTERVAL 1 DAY 
GROUP BY d.date

The test_dates table is indexed on the date column, and the test_log table is indexed on the timest column.

But when I EXPLAIN this query, the test_log table shows access type "ALL" and a NULL key:

+----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows  | filtered | Extra                                          |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+------------------------------------------------+
|  1 | SIMPLE      | d     | NULL       | index | PRIMARY       | PRIMARY | 3       | NULL |   705 |   100.00 | Using index                                    |
|  1 | SIMPLE      | l     | NULL       | ALL   | emissione     | NULL    | NULL    | NULL | 98256 |   100.00 | Range checked for each record (index map: 0x2) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+-------+----------+------------------------------------------------+

Why can't MySQL use the table indexes?

The log table has about 100,000 rows, and the query is very slow.

Answer 1:

Try running this as a correlated subquery:

SELECT d.date,
       (SELECT COUNT(l.id) 
        FROM test_log l 
        WHERE l.timest >= d.date AND l.timest < d.date + INTERVAL 1 DAY 
       ) AS cnt
FROM test_dates d;

MySQL does not always make good use of indexes when a join is combined with GROUP BY. Rewriting the aggregate as a correlated subquery can sometimes give a significant performance boost. Your tables already have the correct indexes.
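Whether the rewrite helps depends on the MySQL version and the data, so it is worth checking the plan rather than assuming. A minimal sketch, using the table names from the question:

```sql
-- Show the execution plan for the correlated-subquery version.
EXPLAIN
SELECT d.date,
       (SELECT COUNT(l.id)
        FROM test_log l
        WHERE l.timest >= d.date AND l.timest < d.date + INTERVAL 1 DAY
       ) AS cnt
FROM test_dates d;
```

In the output, the key column of the row for l shows which index, if any, was chosen for test_log.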

Answer 2:

If the index and the correlated subquery are not working for you, a better option may be to add a summary count column to your dates table. Then, whenever you insert a row into your log table, add 1 to the counter in the dates table for the date in question. If no row exists yet for that date, insert one with its count set to 1.

After that, all you need to do is select a SUM() from your dates table over the date range, without ever looking at the detail rows. Once a given date is selected for closer review, you can query the underlying log data for it.
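The counter approach could be sketched roughly as follows. The column name log_count, the @ts variable, and the literal dates are assumptions made for illustration; none of them appear in the original schema.

```sql
-- Hypothetical summary column; not part of the original schema.
ALTER TABLE test_dates
  ADD COLUMN log_count INT UNSIGNED NOT NULL DEFAULT 0;

-- Run alongside each insert into test_log, ideally in the same transaction.
-- @ts stands in for the timest value of the new log row.
SET @ts = '2016-04-04 10:15:00';

-- Insert the date with a count of 1, or increment the existing counter.
INSERT INTO test_dates (`date`, log_count)
VALUES (DATE(@ts), 1)
ON DUPLICATE KEY UPDATE log_count = log_count + 1;

-- Reporting then reads only the small summary table.
SELECT SUM(log_count) AS total
FROM test_dates
WHERE `date` >= '2016-04-01' AND `date` < '2016-05-01';
```

This relies on date being the primary key of test_dates, which is what makes ON DUPLICATE KEY UPDATE fire for an existing day. Counters for rows already in test_log would need a one-time backfill.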

Answer 3:

Turn it around. First do an efficient GROUP BY on the second table (see subquery, below), then fill in the missing days (outer query):

SELECT  date,
        IFNULL(log.ct, 0) AS ct
    FROM  
      ( SELECT  DATE(timest) AS date,
                COUNT(*) AS ct
            FROM  test_log 
            GROUP BY date
      ) AS log
    RIGHT JOIN  test_dates AS d  USING(date);

If you want to limit the date range, add a WHERE clause in both the subquery and the outer query.
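As a sketch of that last point, the same range condition can be applied to timest in the subquery and to date in the outer query. The date literals here are placeholders, not values from the question.

```sql
SELECT  d.date,
        IFNULL(log.ct, 0) AS ct
    FROM
      ( SELECT  DATE(timest) AS date,
                COUNT(*) AS ct
            FROM  test_log
            -- Restrict the log rows before grouping.
            WHERE  timest >= '2016-01-01'
              AND  timest <  '2016-02-01'
            GROUP BY  DATE(timest)
      ) AS log
    RIGHT JOIN  test_dates AS d  USING(date)
    -- Restrict the calendar rows to the same range.
    WHERE  d.date >= '2016-01-01'
      AND  d.date <  '2016-02-01';
```

Comparing timest directly against the bounds, rather than wrapping it in DATE() inside the WHERE clause, leaves the optimizer free to consider a range scan on the timest index.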

Source: https://stackoverflow.com/questions/36401234/optimize-mysql-query-for-date-group

Tags

mysql

optimization

indexing