Aggregation pipeline and indexes

Submitted by 心已入冬 on 2019-12-21 04:14:39

Question


According to http://docs.mongodb.org/manual/core/indexes/#multikey-indexes, it is possible to create an index on an array field using a multikey index. http://docs.mongodb.org/manual/applications/aggregation/#pipeline-operators-and-indexes lists some ways an index can be used in the aggregation framework. However, there may be times when I need to perform an $unwind on an array field in order to perform a $group. My question is: can multikey indexes (or any index on such an array field) still be used once the field has been operated on in the middle of the pipeline?


Answer 1:


Generally, only pipeline operators that can be flattened to a normal query ($match, $limit, $sort, and $skip) will be able to use the indexes on a collection. This is one of the reasons the $geoNear operator added in 2.4 has to be at the start of the pipeline.

Once you mutate the documents with $project, $group, or $unwind, the index is no longer valid/usable.

If you have an index on an array field, you can still use it before the $unwind to speed up the selection of documents that enter the pipeline, and then further refine the selected documents with a second $match.

Consider documents like:

{ tags: [ 'cat', 'bird', 'blue' ] }

With an index on tags.
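As a rough mental model (not how MongoDB actually stores it; the real index is a B-tree and a prefix query would be a range scan), a multikey index can be pictured as one entry per array element, each pointing back at the original document. The helper names `buildMultikeyIndex` and `lookupByPrefix` and the extra sample documents below are made up for illustration:

```javascript
// Conceptual sketch only: a Map stands in for MongoDB's B-tree index.
const docs = [
  { _id: 1, tags: ['cat', 'bird', 'blue'] },
  { _id: 2, tags: ['dog', 'red'] },
  { _id: 3, tags: ['bee'] },
];

// Build one index entry per array element, each pointing back at its source document.
function buildMultikeyIndex(documents, field) {
  const index = new Map();
  for (const doc of documents) {
    for (const value of doc[field]) {
      if (!index.has(value)) index.set(value, new Set());
      index.get(value).add(doc._id);
    }
  }
  return index;
}

// Collect the _ids of documents that have at least one indexed value with the prefix.
// (This walks every key for simplicity; a real index would scan only a key range.)
function lookupByPrefix(index, prefix) {
  const ids = new Set();
  for (const [value, docIds] of index) {
    if (value.startsWith(prefix)) {
      for (const id of docIds) ids.add(id);
    }
  }
  return [...ids].sort((a, b) => a - b);
}

const tagIndex = buildMultikeyIndex(docs, 'tags');
const matchingIds = lookupByPrefix(tagIndex, 'b');
console.log(matchingIds); // ids of the documents with a tag starting with "b"
```

The point of the sketch is that index entries refer to whole stored documents, which is why the index can pick documents for the pipeline but knows nothing about the per-element documents an $unwind produces later.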

If you only wanted to group the tags starting with b then you could perform an aggregation like:

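{ pipeline: [
      // Coarse filter: runs before any mutation, so it can use the index on tags.
      { $match : { tags : /^b/ } },
      // Emits one document per array element.
      { $unwind : '$tags' },
      // Fine filter: no index here; drops unwound documents such as { tags : 'cat' }.
      { $match : { tags : /^b/ } },
      /* the rest */
  ] }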

The first $match does the coarse-grained match using the index on tags.

The second $match, after the $unwind, won't be able to use the index (the document above is now 3 documents), but it can evaluate each of those documents and filter out the extra ones that the $unwind created (removing { tags : 'cat' } in the example).
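To make the document counts concrete, here is a small in-memory imitation of the three stages in plain JavaScript. It is only a sketch: `matchAnyTag` and `unwindTags` are hypothetical stand-ins for $match and $unwind, and the second sample document is invented.

```javascript
// In-memory stand-ins for the pipeline stages; not the MongoDB implementation.
const collection = [
  { _id: 1, tags: ['cat', 'bird', 'blue'] },
  { _id: 2, tags: ['dog', 'red'] },
];

// Like $match on tags: keep a document if any element (or the scalar value) matches.
const matchAnyTag = (documents, regex) =>
  documents.filter((doc) => [].concat(doc.tags).some((tag) => regex.test(tag)));

// Like $unwind: emit one output document per array element.
const unwindTags = (documents) =>
  documents.flatMap((doc) => doc.tags.map((tag) => ({ ...doc, tags: tag })));

const coarse = matchAnyTag(collection, /^b/);  // whole documents with some "b" tag
const unwound = unwindTags(coarse);            // one document per tag
const refined = matchAnyTag(unwound, /^b/);    // only the "b" tags survive

console.log(coarse.length, unwound.length, refined.map((d) => d.tags));
```

Only the first filter runs against stored documents, so it is the only one that corresponds to something an index could answer; the other two steps work on documents that exist only inside the pipeline.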

HTH - Rob.




Answer 2:


Hmm, @Rob does give the right answer, but I see how this sentence of his could lead you down the wrong path a little:

"If you have an index on an array field you can still use it before and after the $unwind to speed up the selection of documents to pipeline and then further refine the selected documents."

Basically the example he gives:

{ pipeline: [
      { $match : { tags : /^b/ } },
      { $unwind : '$tags' },
      { $match : { tags : /^b/ } },
      /* the rest */
  ] }

will not use a multikey index past the $unwind. It will be able to use the index to find all root documents that have a tag starting with b; however, it will not be able to use an index to filter the unwound subdocuments in the second $match.

A $match can only use an index before the mutation.

So, basically, once you have mutated the document and loaded it into the pipeline, it is currently almost impossible to use an index.



Source: https://stackoverflow.com/questions/15606963/aggregation-pipeline-and-indexes
