问题
Setup
I'm using Single Table Inheritance (STI) in Rails simplified to the following:
class Vehicle
belongs_to :user
end
class Car < Vehicle
end
class Plane < Vehicle
end
Each record in the vehicles table will have a type column set to either 'Car' or 'Plane' in addition to the user_id foreign key. It could also have additional values if more vehicle types are added, however, type will always have a much lower cardinality than user_id. Just as in real life, I expect this table to contain many more Cars.
There is a compound index on [:user_id, :type] (in that order) and these records are looked up by their subclasses.
Question(s)
I believe that in the worst case of no Planes, the index will be used since user_id is first and the second part will essentially be ignored. In this case a single index would have a super small benefit in that it's not maintaining the compound second column.
What happens in the case where there's an equal split?
- Will the index cut the records in half and thus have a decent effect?
- Will the cost of the database maintaining a compound index over a single one (i.e. just
user_id) exceed or negate any savings?
Example SQL
An example ActiveRecord call would be Car.where(user_id: 10) which generates the following SQL:
SELECT `vehicles`.* FROM `vehicles` WHERE `vehicles`.`type` IN ('Car')
AND `vehicles`.`user_id` = 10
回答1:
The cost of maintaining an index (single-column or multi-column) is almost always outweighed by the performance improvement when that index is used. It is a small increment on each INSERT/DELETE, plus a cost if changing the value of an indexed field via UPDATE. (The UPDATE case is rare.) So, don't worry about the cost of "maintaining a compound index".
WHERE `vehicles`.`type` IN ('Car')
AND `vehicles`.`user_id` = 10
needs INDEX(user_id, type).
The optimizer will
- discover that that index is a possible candidate,
- check some statistics, then
- either use the index, or decide the cardinality is poor and simply scan the table.
Include the index; don't worry about it.
I ordered the fields that way, not (type, user_id) based on your IN, which implies that you might sometimes have multiple values for type.
If all rows in the table have type = 'Car', no problem. Everything I have said still applies. The waste of including the unnecessary type is insignificant.
It is better to have all "=" column(s) first in an index, then at most one other field. Further discussion here.
来源:https://stackoverflow.com/questions/33486907/will-a-compound-index-with-a-second-column-of-low-cardinality-effect-performance