MySQL “IN” queries terribly slow with subquery but fast with explicit values

独自空忆成欢 提交于 2019-11-27 13:35:51

Subqueries execute every time you evaluate them (in MySQL anyway, not all RDBMSes), i.e. you're basically running 7 million queries! Using a JOIN, if possible, will reduce this to 1. Even if adding indexing improves performance of those, you're still running them.

Yes, IN with subqueries is slow. Use a join instead.

SELECT
COUNT(DISTINCT subscriberid)
FROM em_link_data JOIN em_link ON em_link_data.linkid=em_link.id
WHERE em_link.campaignid = '2900' AND em_link.link != 'open'

And make sure you've defined indexes on em_link_data.linkid and em_link.id.

Alessandro

The problem is that MySQL executes queries from outside to inside, while you might think that your subquery is done once and then its results are passed to the WHERE expression of the outer query (see MySQL documentation).

If you can't rewrite your query, you should do the following optimizations:

  • add an index on campaignid and link as FrustratedWithFormsDesigner said
  • check that the subquery uses indexes correctly by doing EXPLAIN SELECT ...
  • enable and tweak query cache, as that should speed up the subquery being called multiple times

One more idea would be to install MySQL proxy and write a little script that intercepts your query and rewrites it to use a join.

If your subquery is fast thus campaignid and link are absolutely indexed. l.id is PK and clustered thus is fast. But as far as I remember(from last time I checked this subject) , mysql describes about its internal optimizations for "in" subqueries to use the index sort of subquery result to improve performance and also uses cache for the left side of "IN" to drag it inside the subquery faster and if indexes are set true it must not have such difference to use inner join or "IN" rather than caching and it may be due to cache problem and massive data. http://dev.mysql.com/doc/internals/en/transformation-scalar-in.html

I don't know the situation of the software but if you can use INNER JOIN and you have (probably) some additional definitions before the IN clause in the WHERE clause of your outer query make sure to move that clauses to before of your main INNER JOIN via a temporary INNER JOIN behaves similar to an intervenient "where" clause sequentially and reduces the number of cross comparisons in a JOIN like this :

SELECT ... FROM t
INNER JOIN (SELECT 1) AS tmp ON t.asd=23
INNER JOIN t2 ON ...

Sample comparisons of normal and temp join lookups : 1000 * 1000 > 1000 + (100 * 1000)

Also it seems the subquery is filtered by constant vals thus if it was me I was gonna put the clauses in a sub query generating the resultset and reduce the number of comparisons in a JOIN like this :

SELECT ... FROM t
INNER JOIN (SELECT ... FROM t2 WHERE constant clauses) AS tbl2 ON ...

Anyway, in the "IN" query, comparing any column of the table in the subquery to any column of the table in outer query requires the columns of both side to be precisely indexed (regard to composite indexes) but still it may be a cache problem.

EDITED : Also I was curious to ask : Can the making a composite index on l.campaignid, l.link and l.id make any sense?

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!