Creating a covered index for the aggregation framework

Question

I have a problem creating an index for my query and can't find a similar solution on the web, so maybe some of you can help me.

To simplify the problem, let's say we have Phones with some attributes:

{
  "type":"Samsung",
  "model":"S3",
  "attributes":[{
     "value":"100",
     "name":"BatteryLife"
   },{
     "value":"200$",
     "name":"Price"
   }]
}

With the index: {"type":1, "attributes.value":1}

We have millions of phones for every type, and I want to find the phones of a given type that have given attributes. My query looks like this:

db.Phone.aggregate([
    { "$match": { "type": "Samsung" } },
    { "$match": { "attributes": { "$all": [
        { "value": "100", "name": "BatteryLife" },
        { "value": "200$", "name": "Price" }
    ] } } }
])

And it works! The problem is that this query is highly inefficient, because it uses only the first part of my index, "type" (and I have millions of phones of every type), and doesn't use the "attributes.value" part (type + attributes.value is almost unique, so using it would reduce the work significantly).

@Edit: Thanks to Neil Lunn, I now know this happens because the index is used only by my first $match, so I have to change my query.

@Edit2: I think I found a solution:

db.Phone.aggregate([
{$match: {
    $and: [
        {type: "Samsung"},
        {attributes: {
           $all: [
                { "value": "100", "name": "BatteryLife" },
                { "value": "200$", "name": "Price" }
           ]
        }}
    ]}
}])

Together with db.Phone.ensureIndex({type:1, attributes:1}), this seems to work. I think we can close this now. Thanks for the tip about $match.

Answer 1:

To get the most out of the index, you need a $match early enough in the pipeline that uses all the fields in the index. Also avoid the $and operator: it is unnecessary here, and in the current version (2.4) it can cause an index not to be fully utilized (luckily, this is fixed in the upcoming 2.6).
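As a rough illustration of why the $and wrapper is unnecessary, here is a minimal plain-JavaScript sketch. The flattenAnd helper is hypothetical (it is not part of MongoDB or any driver), and it assumes each $and clause uses a different field name; the match conditions are the ones from the question as asked.

```javascript
// Hypothetical helper (not a MongoDB API): rewrites a top-level $and whose
// clauses use distinct field names into one implicit-AND match document.
function flattenAnd(match) {
  if (!Array.isArray(match.$and)) return match; // nothing to flatten
  const flat = {};
  for (const [key, value] of Object.entries(match)) {
    if (key !== "$and") flat[key] = value; // keep sibling conditions
  }
  for (const clause of match.$and) {
    for (const [field, condition] of Object.entries(clause)) {
      // Two conditions on the same key cannot share one document,
      // so in that case the original $and form is returned unchanged.
      if (field in flat) return match;
      flat[field] = condition;
    }
  }
  return flat;
}

const withAnd = {
  $and: [
    { type: "Samsung" },
    { attributes: { $all: [
      { value: "100", name: "BatteryLife" },
      { value: "200$", name: "Price" }
    ] } }
  ]
};

// A single $match stage with both conditions and no $and.
const pipeline = [{ $match: flattenAnd(withAnd) }];
console.log(JSON.stringify(pipeline, null, 2));
```

The resulting pipeline has one $match stage whose document lists type and attributes side by side, which is the shape that can be passed to db.Phone.aggregate.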

However, your query is not quite correct: you need to use $elemMatch to make sure the same array element satisfies both the name and value conditions.

Your query should be:

db.Phone.aggregate([
{$match: {  type: "Samsung",
           attributes: { $all: [
                {$elemMatch: {"value": "100", "name": "BatteryLife" }},
                {$elemMatch: {"value": "200$", "name": "Price" }}
           ] }
        }
}]);
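To make the "same element" point concrete, here is a small plain-JavaScript model of three matching rules. It is only a sketch of the semantics, not MongoDB code: the function names are made up for illustration, and whole-subdocument equality is approximated by comparing JSON strings (which, like MongoDB's exact embedded-document match, is sensitive to field order and extra fields).

```javascript
// Dotted-path style ({"attributes.value": v, "attributes.name": n}):
// each condition may be satisfied by a different array element.
function matchesDotted(doc, value, name) {
  return doc.attributes.some(a => a.value === value) &&
         doc.attributes.some(a => a.name === name);
}

// $elemMatch style: a single element must satisfy every condition.
function matchesElem(doc, value, name) {
  return doc.attributes.some(a => a.value === value && a.name === name);
}

// Whole-subdocument equality: same fields, same values, same field order.
function matchesExact(doc, subdoc) {
  const wanted = JSON.stringify(subdoc);
  return doc.attributes.some(a => JSON.stringify(a) === wanted);
}

// The phone from the question.
const s3 = {
  type: "Samsung",
  model: "S3",
  attributes: [
    { value: "100", name: "BatteryLife" },
    { value: "200$", name: "Price" }
  ]
};

// A made-up phone whose values and names are "crossed".
const crossed = {
  type: "Samsung",
  model: "X",
  attributes: [
    { value: "100", name: "Price" },
    { value: "200$", name: "BatteryLife" }
  ]
};

console.log(matchesDotted(crossed, "100", "BatteryLife")); // true (false positive)
console.log(matchesElem(crossed, "100", "BatteryLife"));   // false
console.log(matchesElem(s3, "100", "BatteryLife"));        // true
console.log(matchesExact(s3, { value: "100", name: "BatteryLife" })); // true
console.log(matchesExact(s3, { name: "BatteryLife", value: "100" })); // false (field order)
```

In this model, only the $elemMatch-style check both ties value and name to one element and ignores how the fields happen to be ordered in the stored subdocument.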

Note that this is not going to be a covered query, since attributes.value and attributes.name are embedded fields, not to mention that name is not in the index.

For best performance, the index needs to be {"type":1, "attributes.value":1, "attributes.name":1}. The query still won't be covered, but the index will be much more selective than the current one.

Source: https://stackoverflow.com/questions/22091373/creating-covered-index-for-aggregation-framework

Tags

mongodb

indexing

aggregation-framework