MySQL, query too slow, how to improve it?

Posted by 大兔子大兔子 on 2019-12-11 05:16:15

Question


The problem

I'm running a query using Workbench 5.2.35 against a MySQL 5.5 server, and I get the error "Error Code: 2013. Lost connection to MySQL server during query" after 600.516 seconds, even after small changes to the query. The query does two things:

  1. select a particular type of records characterised by having 'value1' in 'col1' (pass from Stage A to Stage B)
  2. remove the records where the value in 'col2' is the same as the value in 'col2' of the next result (pass from Stage B to Stage C)

    Stage A             Stage B             Stage C
    ***************     ***************     ***************
    *ID *col1*col2*     *ID *col1*col2*     *ID *col1*col2*
    ***************     ***************     ***************
    *1  * A  * a  *     *3  * C  * a  *     *3  * C  * a  *
    *2  * B  * a  *     *7  * C  * f  *     *7  * C  * f  *
    *3  * C  * a  *     *8  * C  * f  *     *16 * C  * b  *
    *4  * S  * a  *     *9  * C  * f  *     *18 * C  * c  *
    *5  * B  * a  *     *16 * C  * b  *
    *6  * A  * g  *     *17 * C  * b  *
    *7  * C  * f  *     *18 * C  * c  *
    *8  * C  * f  *
    *9  * C  * f  *
    *10 * A  * f  *
    *11 * B  * f  *
    *12 * D  * f  *
    *13 * S  * f  *
    *14 * F  * f  *
    *15 * F  * f  *
    *16 * C  * b  *
    *17 * C  * b  *
    *18 * C  * c  *
    

and is a generalisation of: MySQL, select rows where a parameter value depends on the value that it has in a different row

The query is:

SELECT t.id, t.col2, t.col3, t.col4, t.col5 FROM tablename t
WHERE t.id < 1000000
    AND t.col1 = 'value1' 
    AND t.col2 <> 
    (SELECT col2 FROM tablename
        WHERE col1 = 'value1' 
        AND id > t.id 
        LIMIT 1);
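For reference, the Stage A to Stage C transformation shown in the table can be reproduced on the sample rows with a few lines of plain Python. This is only a sketch of the intended result, not of the SQL: it follows the Stage C column of the table, which keeps the first row of each run of equal col2 values, and the function names stage_b and stage_c are made up for illustration.

```python
# Sample rows copied from the Stage A table: (id, col1, col2)
STAGE_A = [
    (1, "A", "a"), (2, "B", "a"), (3, "C", "a"), (4, "S", "a"),
    (5, "B", "a"), (6, "A", "g"), (7, "C", "f"), (8, "C", "f"),
    (9, "C", "f"), (10, "A", "f"), (11, "B", "f"), (12, "D", "f"),
    (13, "S", "f"), (14, "F", "f"), (15, "F", "f"), (16, "C", "b"),
    (17, "C", "b"), (18, "C", "c"),
]


def stage_b(rows, wanted="C"):
    """Stage A -> Stage B: keep only rows whose col1 equals `wanted`."""
    return [row for row in rows if row[1] == wanted]


def stage_c(rows):
    """Stage B -> Stage C: keep a row only when its col2 differs from
    the col2 of the row just before it (first row of each run)."""
    kept = []
    previous_col2 = None
    for row in rows:
        if row[2] != previous_col2:
            kept.append(row)
        previous_col2 = row[2]
    return kept


b_rows = stage_b(STAGE_A)
c_rows = stage_c(b_rows)
print([row[0] for row in b_rows])  # ids in Stage B
print([row[0] for row in c_rows])  # ids in Stage C
```

On the sample data this yields ids 3, 7, 8, 9, 16, 17, 18 for Stage B and 3, 7, 16, 18 for Stage C, matching the table.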

Reason for the error

According to this post, https://serverfault.com/questions/29597/what-does-mysql-error-2013-mean, the causes of this error can be:

  1. Someone KILLed the query
  2. Network problems caused the connection to die
  3. The server crashed/died
  4. Your connection was idle for wait_timeout and was killed
  5. The client wasn't pulling data fast enough for net_write_timeout and was killed

but since the query stops at 600.516 seconds, I guess that the problem in this case is number 4 (timeout).

Possible problems and solutions

The first idea would be to increase wait_timeout, but I think the timeout is only a symptom of an earlier problem: the query never returns anything and just keeps running. The limit t.id < 1000000 is there precisely to test the query on a reasonably small subset (the table has about 200 million rows). So I think there is a problem in the query itself, in particular in the step from Stage B to Stage C (the previous step is trivial).

Any idea for the error or for the query will be much appreciated.

Thanks


The solution

This is the working code, inspired by the best answer. DISTINCT works, but in the end I used GROUP BY and ORDER BY to present the results in a better way.

SELECT id, col1, col2, ..., coln FROM tablename
    WHERE col1 = 'value1' 
    AND col2 = 'value2'
    ... 
    AND coln = 'valuen'
    GROUP BY col2
    ORDER BY id;
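A minimal sketch of the same GROUP BY idea, run on the sample rows from the question. Several things here are assumptions rather than the original setup: it uses Python's built-in sqlite3 module with an in-memory table instead of MySQL, 'C' plays the role of 'value1', and it selects MIN(id) under the made-up alias first_id so that the row picked for each group is well defined. The query above relies on MySQL 5.5 accepting non-aggregated columns next to GROUP BY, which MySQL rejects when the ONLY_FULL_GROUP_BY SQL mode is enabled.

```python
import sqlite3

# In-memory SQLite table standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)

# Sample rows copied from the Stage A table in the question.
rows = [
    (1, "A", "a"), (2, "B", "a"), (3, "C", "a"), (4, "S", "a"),
    (5, "B", "a"), (6, "A", "g"), (7, "C", "f"), (8, "C", "f"),
    (9, "C", "f"), (10, "A", "f"), (11, "B", "f"), (12, "D", "f"),
    (13, "S", "f"), (14, "F", "f"), (15, "F", "f"), (16, "C", "b"),
    (17, "C", "b"), (18, "C", "c"),
]
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", rows)

# One row per distinct col2 among the col1 = 'C' rows,
# identified by the smallest id in each group.
dedup = conn.execute(
    """
    SELECT MIN(id) AS first_id, col2
    FROM tablename
    WHERE col1 = 'C'
    GROUP BY col2
    ORDER BY first_id
    """
).fetchall()

print(dedup)
conn.close()
```

On the sample data this returns ids 3, 7, 16 and 18, the same rows as Stage C. Note that grouping keeps one row per distinct col2 value overall, so it only matches the "first row of each run" reading when a col2 value never reappears later in the table.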

Answer 1:


SELECT DISTINCT Par FROM table_name

http://www.w3schools.com/sql/sql_distinct.asp




Answer 2:


I would rewrite it using NOT IN; the query optimizer has a special case for that.
Also, I would use a different trick to limit the number of results to one.

The problem with LIMIT is that it first creates a temp table with all the results and then selects 1 row from that.

SELECT t.id, t.col2, t.col3, t.col4, t.col5 
FROM tablename t
WHERE t.id < 1000000
    AND t.col1 = 'value1' 
    AND t.col2 NOT IN 
    (SELECT col2 FROM tablename
        WHERE col1 = 'value1' 
        AND id = t.id+1);    -- assuming that `id` is the primary key.

If you have a compound index on (col1, col2) and use id as your primary key, the query should not take forever.
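A small sketch of how such a compound index can be created and checked. It is an illustration under stated assumptions, not the MySQL setup from the question: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, and the index name idx_col1_col2 is made up. In MySQL the corresponding statements would be CREATE INDEX ... ON tablename (col1, col2) and EXPLAIN SELECT ..., whose output looks different.

```python
import sqlite3

# In-memory SQLite table standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)

# Compound index on (col1, col2); the index name is hypothetical.
conn.execute("CREATE INDEX idx_col1_col2 ON tablename (col1, col2)")

# Ask SQLite how it would run the inner part of the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT col2 FROM tablename WHERE col1 = 'value1'"
).fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "idx_col1_col2" in plan_text

print(plan_text)
print(uses_index)
conn.close()
```

If the plan text names the index, the lookup on col1 is served from the index instead of a full table scan, which is the effect the compound index is meant to have.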

Looking at your query, I'd rewrite it as:

SELECT t.id, t.col2, t.col3, t.col4, t.col5 
FROM tablename t
WHERE t.id IN ( 
  SELECT MIN(t2.id)      -- first id of each col2 group
  FROM tablename t2
  WHERE t2.col1 = 'value1'
  GROUP BY t2.col2);

This should do the trick, if I've studied the stages correctly.



Source: https://stackoverflow.com/questions/7848063/mysql-query-too-slow-how-to-improve-it
