What should be indexed to improve performance?

Submitted by 六月ゝ 毕业季﹏ on 2019-12-24 11:37:12

Question


Given this query, which column or columns should be indexed to optimize query performance?

SELECT *
  FROM `activities`
 WHERE (user_id = 90000 AND activity_type_id IN(300,400,808,9494))
 ORDER BY created_at DESC
 LIMIT 70

Answer 1:


In general, the selection filters can use indexes on user_id or activity_type_id or both (in either order).

The ordering operation might be able to use an index on created_at.

It is likely that for this query, a composite index on (user_id, activity_type_id) would give the best result, assuming that MySQL can actually make use of it. Failing that, it is likely to be better to index user_id than activity_type_id because it is likely to provide better selectivity. One reason for thinking that is that there would be 4 subsections of the index to scan if it uses an index on activity_type_id, compared with just one subsection to scan if it uses an index on user_id alone.
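As a minimal sketch of the composite index suggested above: the table and column names come from the question, while the index name idx_activities_user_type is made up for illustration.

```sql
-- Composite index: user_id first (equality match), then activity_type_id (IN list).
-- The index name is hypothetical; pick one that fits your naming scheme.
CREATE INDEX idx_activities_user_type
    ON activities (user_id, activity_type_id);
```

Running EXPLAIN on the query afterwards shows whether MySQL actually picks this index.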

Trying to rely on an index for the sort order is likely to mean a full table scan, so it is less likely to be beneficial. I would not create an index on created_at to support this query; there might be other queries where it would be beneficial.




Answer 2:


Certainly all columns in WHERE clauses should be indexed.

But the IN clause may prevent MySQL from using an index, in which case it falls back to a table scan.

I think reading these answers might help:

MySQL not using indexes with WHERE IN clause?




Answer 3:


You are doing a lookup on user_id and activity_type_id, so create indexes on both columns.




Answer 4:


I would index just user_id.




Answer 5:


Assuming you're not hiding a JOIN in the actual production code, indexing activity_type_id should be the best option.




Answer 6:


I'd add two indexes on the activities table, one on (user_id, activity_type_id) and the other on (created_at). I'd also look really hard at just which columns from the activities table are actually used; if you can reduce the number of columns retrieved, you may improve response time. I'd also grab the query plan before making any changes to the database, then compare it to the plan generated after making any or all of the changes.
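A rough sketch of that before-and-after workflow in MySQL, using the query and the created_at column from the question. The index names are hypothetical, and the plan you see depends on your data and MySQL version.

```sql
-- 1. Capture the plan before changing anything.
EXPLAIN
SELECT *
  FROM activities
 WHERE user_id = 90000
   AND activity_type_id IN (300, 400, 808, 9494)
 ORDER BY created_at DESC
 LIMIT 70;

-- 2. Add the two indexes (index names are made up for this example).
ALTER TABLE activities
  ADD INDEX idx_user_type (user_id, activity_type_id),
  ADD INDEX idx_created_at (created_at);

-- 3. Re-run the EXPLAIN statement from step 1 and compare the
--    key, rows, and Extra columns with the earlier output.
```

If the application only needs a few columns, replacing SELECT * with an explicit column list in both runs keeps the comparison honest.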

Share and enjoy.




Answer 7:


I wouldn't create any additional indexes at all. Instead, I would design the table so that it takes full advantage of an InnoDB clustered primary key:

create table activities
(
    user_id int unsigned not null,
    activity_id smallint unsigned not null,
    primary key (user_id, activity_id) -- composite clustered primary key; the column order is important
)
engine=innodb;

or

create table activities
(
    user_id int unsigned not null,
    activity_id smallint unsigned not null,
    primary key (activity_id, user_id) -- hmmmm, the other way round, why is that?
)
engine=innodb;

Also, have a read of the following:

MySQL and NoSQL: Help me to choose the right one

How to avoid "Using temporary" in many-to-many queries?

60 million entries, select entries from a certain month. How to optimize database?

Rewriting mysql select to reduce time and writing tmp to disk

Hope it helps, and remember: InnoDB FTW ;)




Answer 8:


To make the right decision you have to take into consideration the following:

If the user_id is part of the primary key (you said it probably is), then is the primary key the clustered index for the table? If it is, is the user_id in the first position in the clustered index? If so, then how many activities do you expect to have per user? If there are 1-40 activities per user, then adding another index wouldn't be useful and would hurt insert performance. The reason is that all of the activity rows for a user will be clustered together and will likely be on the same database page, so adding activity_type_id to the index won't help.

If the primary key is not clustered and the user_id is not in the first position in the primary key, or the user_id is not in the primary key at all, then your best bet is a non-clustered index on (user_id, activity_type_id). The query optimizer should be smart enough to use the index, since both user_id and activity_type_id appear in the WHERE clause, even with the IN list present. You could also add created_at at the end of the index, since you are ordering the query results that way.
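A sketch of the index described above, with created_at appended. The index names are hypothetical. Option B is not part of this answer; it is included only as an alternative worth comparing.

```sql
-- Option A: the index from the answer, with created_at as the last column.
CREATE INDEX idx_user_type_created
    ON activities (user_id, activity_type_id, created_at);

-- Option B (an assumption added for comparison, not from the answer):
-- index the sort column directly after user_id.
CREATE INDEX idx_user_created
    ON activities (user_id, created_at);
```

Because the IN list matches several activity_type_id values, option A returns rows grouped by type, so MySQL will generally still need a sort step for ORDER BY created_at. Option B lets it read one user's rows in created_at order and filter by type as it goes. Which one wins depends on how many rows each user has, so compare the EXPLAIN output for both before keeping either.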

Take care in creating indexes specifically for one query, but if the query is heavily used, it's often necessary.



Source: https://stackoverflow.com/questions/6165286/what-should-be-indexed-to-improve-performance
