Question
Imagine I have a collection like this:
{
"_id": "10280",
"city": "NEW YORK",
"state": "NY",
"departments": [
{"departmentType":"01",
"departmentHead":"Peter"},
{"departmentType":"02",
"departmentHead":"John"}
]
},
{
"_id": "10281",
"city": "LOS ANGELES",
"state": "CA",
"departments": [
{"departmentType":"02",
"departmentHead":"Joan"},
{"departmentType":"03",
"departmentHead":"Mary"}
]
},
{
"_id": "10284",
"city": "MIAMI",
"state": "FL",
"department": [
"departments": [
{"departmentType":"01",
"departmentHead":"George"},
{"departmentType":"02",
"departmentHead":"Harry"}
]
}
I'd like to get a count per departmentType, something like:
[{"departmentType":"01", "dCount":2},
{"departmentType":"02", "dCount":3},
{"departmentType":"03", "dCount":1}
]
For this, I've tried almost everything already, but all examples I find online are easier ones where the group by is done over a field at the root level of the document. Instead, here I'm trying to group by departmentType, and that seems to break everything I found so far.
Any ideas on how to do this using Mongoose's aggregation implementation or MapReduce?
Ideally, I'd like to exclude all departmentTypes with count <= 1 and sort the results by departmentType.
Thank you all in advance!
Answer 1:
You need to $unwind the departments array, which creates one document per array entry so that the entries can be aggregated in the pipeline.
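To make the effect of $unwind followed by $group concrete, here is a rough plain-JavaScript emulation over the three sample documents from the question. It is only an illustration of the idea, not MongoDB code: the variable names (docs, unwound, counts, result) are made up for this sketch, and the stray "department" line in the third document is assumed to be a typo.

```javascript
// Sample documents from the question (assumption: the extra "department" line is a typo).
const docs = [
  { _id: "10280", city: "NEW YORK", state: "NY", departments: [
    { departmentType: "01", departmentHead: "Peter" },
    { departmentType: "02", departmentHead: "John" } ] },
  { _id: "10281", city: "LOS ANGELES", state: "CA", departments: [
    { departmentType: "02", departmentHead: "Joan" },
    { departmentType: "03", departmentHead: "Mary" } ] },
  { _id: "10284", city: "MIAMI", state: "FL", departments: [
    { departmentType: "01", departmentHead: "George" },
    { departmentType: "02", departmentHead: "Harry" } ] }
];

// Rough equivalent of { $unwind: "$departments" }:
// one record per array element, with the array replaced by that element.
const unwound = docs.flatMap(doc =>
  doc.departments.map(dep => ({ ...doc, departments: dep })));

// Rough equivalent of $group with _id: "$departments.departmentType" and { $sum: 1 }.
const counts = new Map();
for (const rec of unwound) {
  const type = rec.departments.departmentType;
  counts.set(type, (counts.get(type) || 0) + 1);
}

// Shape the counts like the output requested in the question, ordered by departmentType.
const result = [...counts]
  .map(([departmentType, dCount]) => ({ departmentType, dCount }))
  .sort((a, b) => a.departmentType.localeCompare(b.departmentType));

console.log(result);
```

The three documents unwind into six records, and grouping those records by departments.departmentType gives the per-type counts the question asks for.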
Unfortunately, you can't pre-filter the departmentTypes with a count <= 1, because $size only matches an exact array length and the per-type count does not exist until after the $group. You can filter them out of the results instead. It's not great, but it works. The example below first keeps only those records with EXACTLY 2 departments, but that is for demo purposes only; you probably want to drop the first $match, because the second $match filters out counts <= 1 from the results later on:
db.runCommand({
    aggregate: "so",
    pipeline: [
        { // keep only records with exactly 2 departments (demo only)
            $match: {
                departments: { $size: 2 }
            }
        },
        // unwind - create a doc for each department in the array
        { $unwind: "$departments" },
        { // count the departments per departmentType
            $group: {
                _id: "$departments.departmentType",
                count: { $sum: 1 }
            }
        },
        { // drop department types with a count <= 1
            $match: {
                count: { $gt: 1 }
            }
        },
        { // rename fields as per the example in the question
            $project: {
                _id: 0,
                departmentType: "$_id",
                dCount: "$count"
            }
        }
    ]
});
Note that I've also assumed that the JSON sample in the question has a typo, and that the "department" line doesn't actually exist. This code will work as long as all the documents have the same schema as the first two.
Feel free to drop the first $match, and the last $project if you're not bothered about the actual field names you get.
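The question also asks for Mongoose and for results sorted by departmentType, which the pipeline above does not cover. A minimal sketch of how that might look follows, with the demo-only first $match dropped and a $sort stage added. The helper name countByDepartmentType and the model passed to it are hypothetical (the question does not show a schema or model); the only Mongoose calls assumed are Model.aggregate(pipeline) and .exec().

```javascript
// Pipeline stages as plain objects; the same array also works in the mongo shell.
const pipeline = [
  // one document per department entry
  { $unwind: "$departments" },
  // count entries per departmentType
  { $group: { _id: "$departments.departmentType", count: { $sum: 1 } } },
  // drop department types with a count <= 1
  { $match: { count: { $gt: 1 } } },
  // sort by departmentType (still stored in _id at this point), ascending
  { $sort: { _id: 1 } },
  // rename fields as per the example in the question
  { $project: { _id: 0, departmentType: "$_id", dCount: "$count" } }
];

// Hypothetical helper: `Model` is assumed to be a Mongoose model
// for the collection shown in the question.
function countByDepartmentType(Model) {
  // Model.aggregate() builds the aggregation; exec() returns a promise of the results.
  return Model.aggregate(pipeline).exec();
}
```

Assuming a model named Location (a made-up name), usage would be along the lines of countByDepartmentType(Location).then(rows => console.log(rows)). Sorting on _id before the $project is a design choice; sorting on departmentType after the $project should give the same order.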
Source: https://stackoverflow.com/questions/12753440/mapreduce-aggregation-group-by-a-value-in-a-nested-document