Question
The problem
I'm running a query with MySQL Workbench 5.2.35 against a MySQL 5.5 server, and I get the error "Error Code: 2013. Lost connection to MySQL server during query" after 600.516 seconds, even after making small changes to the query. The query has two purposes:
- select a particular type of record, characterised by having 'value1' in 'col1' (passing from Stage A to Stage B)
- remove the records where the value in 'col2' is the same as the value in 'col2' of the next result (passing from Stage B to Stage C)
Stage A           Stage B           Stage C
***************   ***************   ***************
*ID *col1*col2*   *ID *col1*col2*   *ID *col1*col2*
***************   ***************   ***************
*1  * A  * a  *   *3  * C  * a  *   *3  * C  * a  *
*2  * B  * a  *   *7  * C  * f  *   *7  * C  * f  *
*3  * C  * a  *   *8  * C  * f  *   *16 * C  * b  *
*4  * S  * a  *   *9  * C  * f  *   *18 * C  * c  *
*5  * B  * a  *   *16 * C  * b  *
*6  * A  * g  *   *17 * C  * b  *
*7  * C  * f  *   *18 * C  * c  *
*8  * C  * f  *
*9  * C  * f  *
*10 * A  * f  *
*11 * B  * f  *
*12 * D  * f  *
*13 * S  * f  *
*14 * F  * f  *
*15 * F  * f  *
*16 * C  * b  *
*17 * C  * b  *
*18 * C  * c  *
The query is a generalisation of: MySQL, select rows where a parameter value depends on the value that it has in a different row
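To pin down the expected output, here is a minimal Python sketch of the two steps on the sample data. It assumes that 'value1' corresponds to 'C' in the example and that Stage C keeps the first row of each run of equal col2 values, which is what the Stage C table shows. The function names `stage_b` and `stage_c` are made up for illustration.

```python
from itertools import groupby

# Stage A sample rows (id, col1, col2), copied from the table above.
STAGE_A = [
    (1, "A", "a"), (2, "B", "a"), (3, "C", "a"), (4, "S", "a"),
    (5, "B", "a"), (6, "A", "g"), (7, "C", "f"), (8, "C", "f"),
    (9, "C", "f"), (10, "A", "f"), (11, "B", "f"), (12, "D", "f"),
    (13, "S", "f"), (14, "F", "f"), (15, "F", "f"), (16, "C", "b"),
    (17, "C", "b"), (18, "C", "c"),
]


def stage_b(rows, wanted="C"):
    """Keep only the rows whose col1 equals `wanted`, ordered by id."""
    return sorted(r for r in rows if r[1] == wanted)


def stage_c(rows):
    """Keep the first row of each run of consecutive rows sharing col2."""
    return [next(group) for _, group in groupby(rows, key=lambda r: r[2])]


b_ids = [r[0] for r in stage_b(STAGE_A)]
c_ids = [r[0] for r in stage_c(stage_b(STAGE_A))]
print(b_ids)  # [3, 7, 8, 9, 16, 17, 18]
print(c_ids)  # [3, 7, 16, 18]
```

Any SQL rewrite of the query can be checked against these two id lists on the sample data before running it on the full table.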
The query is:
SELECT t.id, t.col2, t.col3, t.col4, t.col5 FROM tablename t
WHERE t.id < 1000000
  AND t.col1 = 'value1'
  AND t.col2 <>
    (SELECT col2 FROM tablename
     WHERE col1 = 'value1'
       AND id > t.id
     LIMIT 1);
Reason for the error
According to this post, https://serverfault.com/questions/29597/what-does-mysql-error-2013-mean, the possible causes of this error are:
- Someone KILLed the query
- Network problems caused the connection to die
- The server crashed/died
- Your connection was idle for wait_timeout and was killed
- The client wasn't pulling data fast enough for net_write_timeout and was killed
Since the query stops at 600.516 seconds, I guess that the cause in this case is number 4 (a timeout).
Possible problems and solutions
My first idea would be to increase wait_timeout, but I think the timeout is only a symptom of an earlier problem: the query returns nothing and just keeps running. The limit t.id < 1000000 is there precisely to test the query on a reasonably small subset (the table has about 200 million rows). So I suspect there is a problem in the query itself, in particular in the step from Stage B to Stage C (the previous step is trivial).
Any ideas about the error or the query would be much appreciated.
Thanks
The solution
This is the working code, inspired by the best answer. DISTINCT works, but in the end I used GROUP BY and ORDER BY to present the results in a better way.
SELECT id, col1, col2, ..., coln FROM tablename
WHERE col1 = 'value1'
AND col2 = 'value2'
...
AND coln = 'valuen'
GROUP BY col2
ORDER BY id;
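As a rough check of the GROUP BY approach, the sketch below runs it on the sample data from the question. It rests on several assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, 'value1' is taken to be 'C', and `MIN(id)` with the alias `first_id` is added here so that the id returned for each group is well defined. Without an aggregate, MySQL 5.5 accepts a bare `id` in the select list but does not specify which row of the group it comes from.

```python
import sqlite3

# Stage A sample rows (id, col1, col2) from the question.
ROWS = [
    (1, "A", "a"), (2, "B", "a"), (3, "C", "a"), (4, "S", "a"),
    (5, "B", "a"), (6, "A", "g"), (7, "C", "f"), (8, "C", "f"),
    (9, "C", "f"), (10, "A", "f"), (11, "B", "f"), (12, "D", "f"),
    (13, "S", "f"), (14, "F", "f"), (15, "F", "f"), (16, "C", "b"),
    (17, "C", "b"), (18, "C", "c"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", ROWS)

# MIN(id) picks the lowest id of each col2 group explicitly, instead of
# relying on whichever row the server happens to return for the group.
QUERY = """
    SELECT MIN(id) AS first_id, col2
    FROM tablename
    WHERE col1 = 'C'
    GROUP BY col2
    ORDER BY first_id
"""
result = conn.execute(QUERY).fetchall()
print(result)  # [(3, 'a'), (7, 'f'), (16, 'b'), (18, 'c')]
```

Note that GROUP BY merges every row with the same col2, not only adjacent ones. On the sample data each col2 value forms a single run, so the result matches Stage C; if a value could reappear later in a separate run, the two would differ.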
Answer 1:
SELECT DISTINCT Par FROM table_name
http://www.w3schools.com/sql/sql_distinct.asp
Answer 2:
I would rewrite it using NOT IN; the query optimizer has a special case for that.
Also, I would use a different trick to limit the number of results to one.
The problem with LIMIT is that it first creates a temp table with all the results and then selects 1 row from that.
SELECT t.id, t.col2, t.col3, t.col4, t.col5
FROM tablename t
WHERE t.id < 1000000
  AND t.col1 = 'value1'
  AND t.col2 NOT IN
    (SELECT col2 FROM tablename
     WHERE col1 = 'value1'
       AND id = t.id + 1);  -- assuming that `id` is the primary key
If you have a compound index on (col1, col2) and use id as your primary key, the query should not take forever.
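The index advice can be tried out in miniature. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module rather than MySQL, the index name `idx_col1_col2` is made up, and the table has just the three columns from the example. The `CREATE INDEX` statement itself is also valid MySQL syntax; on MySQL you would inspect the plan with `EXPLAIN` instead of SQLite's `EXPLAIN QUERY PLAN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
)

# Compound index on (col1, col2); the name is hypothetical.
conn.execute("CREATE INDEX idx_col1_col2 ON tablename (col1, col2)")

# Ask the planner how it would run the col1 filter used by the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id, col2 FROM tablename WHERE col1 = 'value1'"
).fetchall()

# The last column of each plan row is the human-readable detail text.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

If the index name shows up in the printed plan, the filter on col1 is served from the index instead of a full table scan. The exact wording of the plan differs between SQLite versions, and MySQL's optimizer may choose differently on real data.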
Looking at your query, I'd rewrite it as:
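SELECT t.id, t.col2, t.col3, t.col4, t.col5
FROM tablename t
WHERE t.id IN (
    -- Group on the inner alias t2. MySQL returns an unspecified id per
    -- group here; MIN(t2.id) would pin it to the first row of the group.
    SELECT t2.id
    FROM tablename t2
    WHERE t2.col1 = 'value1'
    GROUP BY t2.col2);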
This should do the trick, if I've studied the stages correctly.
Source: https://stackoverflow.com/questions/7848063/mysql-query-too-slow-how-to-improve-it