Below two queries are subqueries. Both are the same and both works fine for me. But the problem is Method 1 query takes about 10 secs to execute while Method 2 query takes under 1 sec.
I was able to convert method 1 query to method 2 but I don't understand what's happening in the query. I have been trying to figure it out myself. I would really like to learn what's the difference between below two queries and how does the performance gain happen ? what's the logic behind it ?
I'm new to these advance techniques. I hope someone will help me out here. Given that I read the docs which does not give me a clue.
Method 1 :
SELECT
*
FROM
tracker
WHERE
reservation_id IN (
SELECT
reservation_id
FROM
tracker
GROUP BY
reservation_id
HAVING
(
method = 1
AND type = 0
AND Count(*) > 1
)
OR (
method = 1
AND type = 1
AND Count(*) > 1
)
OR (
method = 2
AND type = 2
AND Count(*) > 0
)
OR (
method = 3
AND type = 0
AND Count(*) > 0
)
OR (
method = 3
AND type = 1
AND Count(*) > 1
)
OR (
method = 3
AND type = 3
AND Count(*) > 0
)
)
Method 2 :
SELECT
*
FROM
`tracker` t
WHERE
EXISTS (
SELECT
reservation_id
FROM
`tracker` t3
WHERE
t3.reservation_id = t.reservation_id
GROUP BY
reservation_id
HAVING
(
METHOD = 1
AND TYPE = 0
AND COUNT(*) > 1
)
OR
(
METHOD = 1
AND TYPE = 1
AND COUNT(*) > 1
)
OR
(
METHOD = 2
AND TYPE = 2
AND COUNT(*) > 0
)
OR
(
METHOD = 3
AND TYPE = 0
AND COUNT(*) > 0
)
OR
(
METHOD = 3
AND TYPE = 1
AND COUNT(*) > 1
)
OR
(
METHOD = 3
AND TYPE = 3
AND COUNT(*) > 0
)
)
An Explain Plan would have shown you why exactly you should use Exists. Usually the question comes Exists vs Count(*). Exists is faster. Why?
With regard to challenges present by NULL: when subquery returns
Null, for IN the entire query becomesNull. So you need to handle that as well. But usingExist, it's merely afalse. Much easier to cope. SimplyINcan't compare anything withNullbutExistscan.e.g.
Exists (Select * from yourtable where bla = 'blabla');you get true/false the moment one hit is found/matched.In this case
INsort of takes the position of theCount(*)to select ALL matching rows based on theWHEREbecause it's comparing all values.
But don't forget this either:
EXISTSexecutes at high speed againstIN: when the subquery results is very large.INgets ahead ofEXISTS: when the subquery results is very small.
Reference to for more details:
Method 2 is fast because it is using EXISTS operator, where I MySQL do not load any results.
As mentioned in your docs link as well, that it omits whatever is there in SELECT clause. It only checks for the first value that matches the criteria, once found it sets the condition TRUE and moves for further processing.
On the other side Method 1 has IN operator which loads all possible values and then matches it. Condition is set TRUE only when exact match is found which is time consuming process.
Hence your method 2 is fast.
Hope it helps...
The EXISTS operator is a Boolean operator that returns either true or false. The EXISTS operator is often used the in a subquery to test for an “exist” condition.
SELECT
select_list
FROM
a_table
WHERE
[NOT] EXISTS(subquery);
If the subquery returns any row, the EXISTS operator returns true, otherwise, it returns false.
In addition, the EXISTS operator terminates further processing immediately once it finds a matching row. Because of this characteristic, you can use the EXISTS operator to improve the performance of the query in some cases.
The NOT operator negates the EXISTS operator. In other words, the NOT EXISTS returns true if the subquery returns no row, otherwise it returns false.
You can use SELECT *, SELECT column, SELECT a_constant, or anything in the subquery. The results are the same because MySQL ignores the select_list that appears in the SELECT clause.
The reason is that the EXISTS operator works based on the “at least found” principle. It returns true and stops scanning table once at least one matching row found.
On the other hands, when the IN operator is combined with a subquery, MySQL must process the subquery first and then uses the result of the subquery to process the whole query.
The general rule of thumb is that if the subquery contains a large volume of data, the EXISTS operator provides a better performance.
However, the query that uses the IN operator will perform faster if the result set returned from the subquery is very small.
For detail explanations and examples: MySQL EXISTS - mysqltutorial.org
The second Method is faster because you've got this like there "WHERE t3.reservation_id = t.reservation_id". In the first case your subquery has to do a full scan into the table to verify the information. However at the 2o Method the subquery knows exactly what it is looking for and once it is found is checked the having condition then.
Their Official Documentation.SubQuery Optimization with Exists
来源:https://stackoverflow.com/questions/14190788/subqueries-with-exists-vs-in-mysql