mysql not using index?

后端未结

关注

 2  872

终归单人心 2020-12-19 23:25

I have a table with columns like word, A_, E_, U_ .. these columns with X_ are tinyints having the value of how many times the specific letter exists in the word (to later h

2条回答

清歌不尽 (楼主)

2020-12-20 00:23
You're asking about the backend query optimizer. In particular you're asking: "how does it choose an access path? Why index here but tablescan there?"

Let's think about that optimizer. What is it optimizing? Elapsed time, in expectation. It has a model for how long sequential reads and random reads take, and for query selectivity, that is, expected number of rows returned by a query. From several alternative access paths it chooses the one that appears to require the least elapsed time.

Your id > 250000 query had a few things going for it:
1. good selectivity, so less than 1% of rows will appear in the result set
2. id is the Primary Key, so all columns are immediately available upon navigating to the right place in the btree
This caused the optimizer to compute an expected elapsed time for the indexed access path much smaller than expected time for tablescan.

On the other hand, your u_ > 0 query has very poor selectivity, dragging nearly a quarter of the rows into the result set. Additionally, the index is not a covering index for your * demand of copying all column values into the result set. So the optimizer predicts it will have to read a quarter of the index blocks, and then essentially all of the data row blocks that they point to. So compared to tablescan, we'd have to read more blocks from disk, and they would be random reads instead of sequential reads. Both of those argue against using the index, so tablescan was selected because it was cheapest. Also, remember that often multiple rows will fit within a single disk block, or within a single read request. We would call it a pessimizer if it always chose the indexed access path, even in cases where indexed disk I/O would take longer.

summary advice

Use an index on a single column when your queries have good selectivity, returning much less than 1% of a relation's rows. Use a covering index when your queries have poor selectivity and you're willing to make a space vs. time tradeoff.
0 讨论(0)

查看其它2个回答
发布评论:

提交评论
- 加载中...

mysql not using index?

summary advice