How to tune self-join table in mysql like this?

Posted by 随声附和 on 2019-12-13 06:14:09

Question


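I have a table from which I'm trying to select a from date and a to date for each row, where the to date is the next later from_date for the same fk_id and fk_pb. The query takes 2 minutes to run on 4 million records, and I'm not sure how much more I can squeeze out of it.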

    SELECT c.fk_id, c.from_date, c.fk_pb, MIN(o.from_date) AS to_date
    FROM TABLE_X c
        INNER JOIN TABLE_X o ON c.fk_id = o.fk_id AND c.fk_pb = o.fk_pb
    WHERE o.from_date > c.from_date
    GROUP BY c.fk_id, c.from_date, c.fk_pb

There are already separate single-column indexes on from_date, fk_pb and fk_id.

The schema looks like this:

+-----------------------------+---------------+------+-----+---------+-------+
| Field                       | Type          | Null | Key | Default | Extra |
+-----------------------------+---------------+------+-----+---------+-------+
| FK_ID                       | int(11)       | YES  | MUL | NULL    |       |
| FK_PB                       | int(11)       | YES  | MUL | NULL    |       |
| FROM_DATE                   | date          | YES  | MUL | NULL    |       |
| TO_DATE                     | date          | YES  |     | NULL    |       |
+-----------------------------+---------------+------+-----+---------+-------+

I know a self-join like this is not ideal in MySQL, but this is how the data arrives, and I'm trying to find the best way to select the from and to dates out of this table. If there's anything else I could do to make the query faster, that would be great.

Thanks a lot.

UPDATE: here is the EXPLAIN output for the query.

+----+-------------+-------+------+----------------------------------------------------------------------+-------------------------+---------+----------------------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys                                                        | key                     | key_len | ref                                    | rows    | Extra                                        |
+----+-------------+-------+------+----------------------------------------------------------------------+-------------------------+---------+----------------------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | c     | ALL  | IDX_FK_PB,IDX_FK_ID,IDX_FRM_DATE                                     | NULL                    | NULL    | NULL                                   | 4527750 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ref  | IDX_FK_PB,IDX_FK_ID,IDX_FRM_DATE                                     | IDX_FK_ID               | 5       | db.c.FK_ID                             |     110 | Using where                                  |
+----+-------------+-------+------+----------------------------------------------------------------------+-------------------------+---------+----------------------------------------+---------+----------------------------------------------+

Answer 1:


Adding a single composite index on all the relevant columns speeds this up:

INDEX(FK_ID, FK_PB, FROM_DATE)

This performs better because:

  • MySQL can satisfy the query for c from the index alone, so it doesn't need to go back to the table rows (selecting a column that is not in the index would slow it down again a bit).
  • MySQL is pretty bad at index merging, so it often chooses not to use it (luckily), and when it does the result is often suboptimal.
  • An index covering everything you search on (here, the join conditions are the search) is faster than MySQL picking one of the single-column indexes (usually the most selective one; SHOW INDEX FROM tablename shows the cardinality) and then having to scan for the values in the other columns.
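
As a minimal sketch of how the suggestion above could be applied, the statements below add the composite index and then re-run EXPLAIN on the query from the question. The index name IDX_ID_PB_DATE is hypothetical (it does not appear in the question); the table, columns and column order are taken from the post.

```sql
-- Hypothetical index name; the column order follows the answer above.
ALTER TABLE TABLE_X
    ADD INDEX IDX_ID_PB_DATE (FK_ID, FK_PB, FROM_DATE);

-- Re-check the execution plan for the original query after adding the index.
EXPLAIN
SELECT c.fk_id, c.from_date, c.fk_pb, MIN(o.from_date) AS to_date
FROM TABLE_X c
    INNER JOIN TABLE_X o ON c.fk_id = o.fk_id AND c.fk_pb = o.fk_pb
WHERE o.from_date > c.from_date
GROUP BY c.fk_id, c.from_date, c.fk_pb;
```

If the optimizer picks the new index, the key column of the EXPLAIN output should show its name instead of NULL or IDX_FK_ID, and Extra may report "Using index". The actual plan depends on the MySQL version and the data distribution, so it is worth comparing the output with the plan posted in the question.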


Source: https://stackoverflow.com/questions/16553536/how-to-tune-self-join-table-in-mysql-like-this
