MongoDB aggregation pipeline size and speed issue

Submitted by ぐ巨炮叔叔 on 2019-12-05 08:34:44

If you can change the schema of the object collection to include a parent_id field, you can drop the first four stages of your pipeline (the first $match, $lookup, $unwind, and $project). That removes the concerns about Line 1 and Line 2.

For example, a document in the object collection would look like:

{
  "_id": "1",
  "name": "object1",
  "metaDataMap": {
    "SOURCE": [
      "ABC",
      "DEF"
    ],
    "DESTINATION": [
      "XYZ",
      "PDQ"
    ],
    "TYPE": [ ]
  },
  "parent_id": "1"
}

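With parent_id stored on each document, the expensive $lookup and $unwind are no longer needed. The first four stages can then be replaced with a single $match, where id is a variable holding the parent ID you are querying for: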

{$match: {parent_id: id}}
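
Since the pipeline now starts by matching on parent_id, an index on that field is likely to help the $match avoid a full collection scan. This is not part of the original answer; it is a minimal sketch that assumes the collection is named objects (as in the pipeline in this answer) and wraps the shell's createIndex call in a hypothetical helper named ensureParentIdIndex:

```javascript
// Hypothetical helper (not from the original answer): create an ascending
// index on parent_id so the leading $match can use it.
// `db` is the mongo shell database handle; the collection name `objects`
// is assumed from the pipeline in this answer.
function ensureParentIdIndex(db) {
  return db.objects.createIndex({ parent_id: 1 });
}
```

In the mongo shell you would call ensureParentIdIndex(db) once, before running the aggregation. Whether the index pays off depends on how selective parent_id is in your data.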

Building on this idea, I optimized the rest of the pipeline, which resulted in:

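db.objects.aggregate([
    // Select the documents that belong to the given parent
    {$match: {parent_id: id}},
    // Turn metaDataMap into an array of {k, v} pairs, dropping keys whose value is an empty array
    {$project: {metaDataMap: {$filter: {input: {$objectToArray: '$metaDataMap'}, cond: {$ne: [[], '$$this.v']}}}}},
    // One document per key, then one document per value
    {$unwind: '$metaDataMap'},
    {$unwind: '$metaDataMap.v'},
    // Collect the distinct values seen for each key
    {$group: {_id: '$metaDataMap.k', val: {$addToSet: '$metaDataMap.v'}}},
    // Report how many distinct values each key has
    {$project: {count: {$size: '$val'}}}
])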
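
To make the stages easier to follow, here is a rough plain-JavaScript imitation of the same logic on an in-memory array. It is only an illustration and not MongoDB code: the function name countDistinctValues and the sample documents are made up, and it assumes every value in metaDataMap is an array of strings.

```javascript
// Plain-JavaScript imitation of the pipeline; runs in Node.js without MongoDB.
// countDistinctValues and sampleDocs are hypothetical, for illustration only.
function countDistinctValues(docs, parentId) {
  const groups = new Map(); // key -> Set of distinct values

  for (const doc of docs) {
    // $match: keep only documents with the requested parent_id
    if (doc.parent_id !== parentId) continue;

    // $objectToArray: walk metaDataMap as (key, values) pairs
    for (const [key, values] of Object.entries(doc.metaDataMap || {})) {
      // $filter: skip keys whose value is an empty array
      // (assumes every value is an array)
      if (!Array.isArray(values) || values.length === 0) continue;

      // both $unwind stages plus $group/$addToSet:
      // collect the distinct values seen for each key
      if (!groups.has(key)) groups.set(key, new Set());
      for (const value of values) groups.get(key).add(value);
    }
  }

  // final $project with $size: number of distinct values per key
  return [...groups].map(([key, set]) => ({ _id: key, count: set.size }));
}

// Made-up sample data: two children of parent "1", one child of parent "2".
const sampleDocs = [
  { _id: "1", parent_id: "1",
    metaDataMap: { SOURCE: ["ABC", "DEF"], DESTINATION: ["XYZ", "PDQ"], TYPE: [] } },
  { _id: "2", parent_id: "1",
    metaDataMap: { SOURCE: ["ABC", "GHI"], DESTINATION: ["XYZ"], TYPE: ["T1"] } },
  { _id: "3", parent_id: "2",
    metaDataMap: { SOURCE: ["ZZZ"], DESTINATION: [], TYPE: [] } },
];

const result = countDistinctValues(sampleDocs, "1");
console.log(result);
```

For this made-up data, SOURCE has three distinct values (ABC, DEF, GHI), DESTINATION has two, and TYPE has one. The document with parent_id "2" is ignored, and the empty TYPE array in the first document contributes nothing.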

This will output:

{ "_id": "TYPE", "count": 2 }
{ "_id": "DESTINATION", "count": 4 }
{ "_id": "SOURCE", "count": 5 }