MongoDB aggregate count of items in two arrays across different documents?

匆匆过客 提交于 2020-01-24 17:33:07

问题


Here is my MongoDB collection schema:

company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"

I need to aggregate it so I get the following output:

[{
  "_id": {
    "company": "Lenovo",
    "model": "T400"
  },
  "tags": {
    tag: "SomeTag"
    pros: 124 // number of times, "SomeTag" tag was found in "pros" array in `Lenovo T400`
    cons: 345 // number of times, "SomeTag" tag was found in "cons" array in `Lenovo T400`
  }
}...]

I tried to do the following:

var aggParams = {};
aggParams.push({ $unwind: '$cons' });
aggParams.push({ $unwind: '$pros' });
aggParams.push({$group: {
  _id: {
    company: '$company',
    model: '$model',
    consTag: '$cons'
  },
  consTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
  _id: {
    company: '$_id.company',
    model: '$_id.model',
    prosTag: '$pros'
  },
  prosTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
  _id: {
    company:'$_id.company',
    model: '$_id.model'
  },
  tags: { $push: { tag: { $or: ['$_id.consTag', '$_id.prosTag'] }, cons: '$consTagCount', pros: '$prosTagCount'} }
}});

But I got the following result:

{
  "_id": {
    "company": "Lenovo",
    "model": "T400"
  },
  "tags": [
    {
      "tag": false,
      "pros": 7
    }
  ]
}

What is the right way to do this with aggregation?


回答1:


Yes this is a bit harder considering that there are multiple arrays, and if you try both at the same time you end up with a "cartesian condition" where one arrray multiplies the contents of the other.

Therefore, just combine the array content at the beginning, which probably indicates how you should be storing the data in the first place:

Model.aggregate(
    [
        { "$project": {
            "company": 1,
            "model": 1,
            "data": {
                "$setUnion": [
                    { "$map": {
                        "input": "$pros",
                        "as": "pro",
                        "in": {
                            "type": { "$literal": "pro" },
                            "value": "$$pro"
                        }
                    }},
                    { "$map": {
                        "input": "$cons",
                        "as": "con",
                        "in": {
                            "type": { "$literal": "con" },
                            "value": "$$con"
                        }
                    }}
                ]
            }
        }},
        { "$unwind": "$data" }
        { "$group": {
            "_id": { 
                "company": "$company",
                "model": "$model",
                "tag": "$data.value"
            },
            "pros": { 
                "$sum": { 
                    "$cond": [
                        { "$eq": [ "$data.type", "pro" ] },
                        1,
                        0
                    ]
                }
            },
            "cons": { 
                "$sum": { 
                    "$cond": [
                        { "$eq": [ "$data.type", "con" ] },
                        1,
                        0
                    ]
                }
            }
        }
    ], 
    function(err,result) {

    }
)

So via the first $project stage the $map operators are adding the "type" value to each item of each array. Not that it really matters here as all items should process "unique" anyway, the $setUnion operator "contatenates" each array into a singular array.

As mentioned earlier, you probably should be storing in this way in the first place.

Then process $unwind followed by $group, wherein each "pros" and "cons" is then evaluated via $cond to for it's matching "type", either returning 1 or 0 where the match is respectively true/false to the $sum aggregation accumulator.

This gives you a "logical match" to count each respective "type" within the aggregation operation as per the grouping keys specified.



来源:https://stackoverflow.com/questions/32155801/mongodb-aggregate-count-of-items-in-two-arrays-across-different-documents

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!