Question
I have the following data of a match:
{
    date: 20140101,
    duration: 23232,
    win: [
        {
            player: "Player1",
            score: 2344324
        },
        {
            player: "Player4",
            score: 23132
        }
    ],
    loss: [
        {
            player: "Player2",
            score: 324
        },
        {
            player: "Player3",
            score: 232
        }
    ]
}
Now I want to count the wins and losses for all players like this:
result :
[
    {
        player: "Player1",
        wins: 12,
        losses: 2
    },
    {
        player: "Player2",
        wins: 7,
        losses: 8
    }
]
My problem is that the win/loss information only exists in the name of the array.
Answer 1:
There is a lot in this, especially if you are relatively new to using aggregate, but it can be done. I'll explain the stages after the listing:
db.collection.aggregate([
    // 1. Unwind both arrays
    {"$unwind": "$win"},
    {"$unwind": "$loss"},
    // 2. Cast each field with a type and the array on the end
    {"$project": {
        "win.player": "$win.player",
        "win.type": {"$cond": [1, "win", 0]},
        "loss.player": "$loss.player",
        "loss.type": {"$cond": [1, "loss", 0]},
        "score": {"$cond": [1, ["win", "loss"], 0]}
    }},
    // Unwind the "score" array
    {"$unwind": "$score"},
    // 3. Reshape to "result" based on the value of "score"
    {"$project": {
        "result.player": {"$cond": [
            {"$eq": ["$win.type", "$score"]},
            "$win.player",
            "$loss.player"
        ]},
        "result.type": {"$cond": [
            {"$eq": ["$win.type", "$score"]},
            "$win.type",
            "$loss.type"
        ]}
    }},
    // 4. Get all unique result within each document
    {"$group": {"_id": {"_id": "$_id", "result": "$result"}}},
    // 5. Sum wins and losses across documents
    {"$group": {
        "_id": "$_id.result.player",
        "wins": {"$sum": {"$cond": [
            {"$eq": ["$_id.result.type", "win"]}, 1, 0
        ]}},
        "losses": {"$sum": {"$cond": [
            {"$eq": ["$_id.result.type", "loss"]}, 1, 0
        ]}}
    }}
])
Summary
This assumes that the "players" in each "win" and "loss" array are unique to start with, which seems reasonable for what is modeled here. The numbered stages work as follows:
1. Unwind both of the arrays. This creates duplicates, but they will be removed later.
2. When projecting, the $cond operator (a ternary) is used to produce some literal string values. The last usage is special because an array is being added, so after projecting, that array is unwound again. More duplicates, but that's the point: one "win" and one "loss" record for each.
3. More projection with the $cond operator, this time with the $eq operator as well. Here we are merging the two fields into one: when the "type" of a field matches the value in "score", that field is used for the "result" value. The outcome is that the two different "win" and "loss" fields now share the same name and are identified by "type".
4. Get rid of the duplicates within each document by simply grouping on the document _id and the "result" fields as keys. Now there should be the same "win" and "loss" records as there were in the original document, just in a different form, as they have been removed from the arrays.
5. Finally, group across all documents to get the totals per "player". $cond and $eq are used again, this time to determine whether the current document is a "win" or a "loss": where it matches we return 1, and where it does not we return 0. Those values are passed to $sum to get the total counts for "wins" and "losses".
And that explains how to get to the result.
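For checking the pipeline output against a small data set, the same tally can be sketched in plain JavaScript, outside of MongoDB. This is only an illustration of the idea (use the array name as the "type", then sum per player), not what the server does internally. The helper name countResults and the second sample match are made up for the example.

```javascript
// Plain JavaScript sketch of what the pipeline computes.
// countResults is a hypothetical helper, not part of any MongoDB API.
function countResults(matches) {
  const totals = new Map();
  for (const match of matches) {
    // The array name itself ("win" or "loss") is used as the type tag.
    for (const type of ["win", "loss"]) {
      // Assumption (same as the pipeline): a player appears at most once
      // per array within a single match document.
      for (const entry of match[type]) {
        if (!totals.has(entry.player)) {
          totals.set(entry.player, { player: entry.player, wins: 0, losses: 0 });
        }
        const row = totals.get(entry.player);
        if (type === "win") {
          row.wins += 1;
        } else {
          row.losses += 1;
        }
      }
    }
  }
  return Array.from(totals.values());
}

// First match is the one from the question; the second is invented.
const sampleMatches = [
  {
    win: [{ player: "Player1", score: 2344324 }, { player: "Player4", score: 23132 }],
    loss: [{ player: "Player2", score: 324 }, { player: "Player3", score: 232 }]
  },
  {
    win: [{ player: "Player2", score: 500 }, { player: "Player1", score: 900 }],
    loss: [{ player: "Player3", score: 10 }, { player: "Player4", score: 20 }]
  }
];

const totals = countResults(sampleMatches);
```

Here totals ends up with one entry per player, in the same shape as the result asked for in the question.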
Learn more about the aggregation operators from the documentation. Some of the "funny" usages of $cond in that listing should be replaceable with the $literal operator, but that will not be available until version 2.6 is released.
"Simplified" case for MongoDB 2.6 and upwards
Of course, there are new set operators in what is the upcoming release at the time of writing, which will help to simplify this somewhat:
db.collection.aggregate([
    { "$unwind": "$win" },
    { "$project": {
        "win.player": "$win.player",
        "win.type": { "$literal": "win" },
        "loss": 1,
    }},
    { "$group": {
        "_id": {
            "_id": "$_id",
            "loss": "$loss"
        },
        "win": { "$push": "$win" }
    }},
    { "$unwind": "$_id.loss" },
    { "$project": {
        "loss.player": "$_id.loss.player",
        "loss.type": { "$literal": "loss" },
        "win": 1,
    }},
    { "$group": {
        "_id": {
            "_id": "$_id._id",
            "win": "$win"
        },
        "loss": { "$push": "$loss" }
    }},
    { "$project": {
        "_id": "$_id._id",
        "results": { "$setUnion": [ "$_id.win", "$loss" ] }
    }},
    { "$unwind": "$results" },
    { "$group": {
        "_id": "$results.player",
        "wins": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "win"]}, 1, 0
        ]}},
        "losses": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "loss"]}, 1, 0
        ]}}
    }}
])
But "simplified" is debatable. To me, that just "feels" like it's "chugging around" and doing more work. It certainly is more traditional, as it simply relies on $setUnion to merge the array results.
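To make the role of $setUnion a little more concrete, here is a rough plain JavaScript sketch of a set union over arrays of tagged documents. It is an approximation for illustration only: MongoDB applies its own comparison rules and does not guarantee the order of the output, whereas this sketch compares by JSON text and keeps first-seen order. The function name setUnion and the duplicated sample entry are assumptions made for the example.

```javascript
// Rough stand-in for a set union: merge arrays and drop exact duplicates.
// Assumption: two entries are "equal" when their JSON text is identical.
function setUnion(...arrays) {
  const seen = new Map();
  for (const array of arrays) {
    for (const item of array) {
      const key = JSON.stringify(item);
      if (!seen.has(key)) {
        seen.set(key, item);
      }
    }
  }
  return Array.from(seen.values());
}

const win = [
  { player: "Player2", type: "win" },
  { player: "Player4", type: "win" }
];
const loss = [
  { player: "Player6", type: "loss" },
  { player: "Player5", type: "loss" },
  { player: "Player5", type: "loss" } // deliberate duplicate, dropped below
];

const results = setUnion(win, loss);
```

Because every entry already carries its "type", merging the two arrays loses no information, which is why the final $group stage can work from a single "results" array.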
But that "work" would be nullified by changing your schema a little, as shown here:
{
    "_id" : ObjectId("531ea2b1fcc997d5cc5cbbc9"),
    "win": [
        {
            "player" : "Player2",
            "type" : "win"
        },
        {
            "player" : "Player4",
            "type" : "win"
        }
    ],
    "loss" : [
        {
            "player" : "Player6",
            "type" : "loss"
        },
        {
            "player" : "Player5",
            "type" : "loss"
        },
    ]
}
And this removes the need to project the array contents by adding the "type" attribute as we have been doing, and reduces the query, and the work done:
db.collection.aggregate([
    { "$project": {
        "results": { "$setUnion": [ "$win", "$loss" ] }
    }},
    { "$unwind": "$results" },
    { "$group": {
        "_id": "$results.player",
        "wins": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "win"]}, 1, 0
        ]}},
        "losses": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "loss"]}, 1, 0
        ]}}
    }}
])
And of course just changing your schema as follows:
{
    "_id" : ObjectId("531ea2b1fcc997d5cc5cbbc9"),
    "results" : [
        {
            "player" : "Player6",
            "type" : "loss"
        },
        {
            "player" : "Player5",
            "type" : "loss"
        },
        {
            "player" : "Player2",
            "type" : "win"
        },
        {
            "player" : "Player4",
            "type" : "win"
        }
    ]
}
That makes things very easy. And this could be done in versions prior to 2.6. So you could do it right now:
db.collection.aggregate([
    { "$unwind": "$results" },
    { "$group": {
        "_id": "$results.player",
        "wins": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "win"]}, 1, 0
        ]}},
        "losses": {"$sum": {"$cond": [
            {"$eq": ["$results.type", "loss"]}, 1, 0
        ]}}
    }}
])
So for me, if it were my application, I would want the schema in the last form shown above rather than what you have. All of the work done in the supplied aggregation operations (with the exception of the last statement) is aimed at taking the existing schema and manipulating it into this form, so that it is then easy to run the simple aggregation statement shown above.
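If you did want to move existing documents to that last form, the reshaping could look something like the following plain JavaScript sketch. The helper name toResultsSchema is hypothetical, and keeping "score" on each entry is an assumption (the schema shown above omits it for brevity). How the converted documents are written back to the collection is left out here.

```javascript
// Hypothetical helper: turn { win: [...], loss: [...] } into { results: [...] },
// tagging each entry with the name of the array it came from.
function toResultsSchema(doc) {
  const tag = (type) => (entry) => ({
    player: entry.player,
    score: entry.score, // assumption: scores are kept alongside the type
    type: type
  });
  // Shallow copy so the input document is left untouched.
  const converted = Object.assign({}, doc, {
    results: doc.win.map(tag("win")).concat(doc.loss.map(tag("loss")))
  });
  delete converted.win;
  delete converted.loss;
  return converted;
}

// The match document from the question.
const original = {
  date: 20140101,
  duration: 23232,
  win: [{ player: "Player1", score: 2344324 }, { player: "Player4", score: 23132 }],
  loss: [{ player: "Player2", score: 324 }, { player: "Player3", score: 232 }]
};

const converted = toResultsSchema(original);
```

Every other field on the document is carried over unchanged, so the simple $unwind and $group statement above can then run against the converted data.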
As each player is "tagged" with the "win/loss" attribute, you can always just discretely access your "winners/losers" anyhow.
As a final thing: your date is not stored as a real date (it is a plain value such as 20140101). I don't like that.
There may have been a reason for doing so but I don't see it. If you need to group by day that is easy to do in aggregation just by using a proper BSON date. You will also then be able to easily work with other time intervals.
So if you fixed the date and made it the start_date, and replaced "duration" with end_time, then you would keep something from which you can get the "duration" by simple math, plus you get lots of extra benefits from having these as date values instead.
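As a sketch of that "simple math", the conversion could look like this in plain JavaScript. Several things here are assumptions, since the question does not say: that "date" is a number in yyyymmdd form, that "duration" is in seconds, and that UTC midnight is an acceptable start time. The helper name toStartAndEnd is made up for the example.

```javascript
// Hypothetical helper. Assumptions: match.date is a yyyymmdd number,
// match.duration is in seconds, and the match starts at UTC midnight.
function toStartAndEnd(match) {
  const year = Math.floor(match.date / 10000);
  const month = Math.floor(match.date / 100) % 100; // 1 to 12
  const day = match.date % 100;
  const start = new Date(Date.UTC(year, month - 1, day));
  const end = new Date(start.getTime() + match.duration * 1000);
  return { start_date: start, end_time: end };
}

const times = toStartAndEnd({ date: 20140101, duration: 23232 });

// The duration is recovered by subtracting the two dates (milliseconds).
const durationSeconds = (times.end_time - times.start_date) / 1000;
```

With real date values stored, the duration no longer needs its own field, and the aggregation date operators can be used for grouping by day or other intervals.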
So that may give you some food for thought on your schema.
For those who are interested, here is some code I used to generate a working set of data:
// Ye-olde array shuffle
function shuffle(array) {
    var m = array.length, t, i;
    while (m) {
        i = Math.floor(Math.random() * m--);
        t = array[m];
        array[m] = array[i];
        array[i] = t;
    }
    return array;
}
for ( var l=0; l<10000; l++ ) {
    var players = ["Player1","Player2","Player3","Player4"];
    var playlist = shuffle(players);
    for ( var x=0; x<playlist.length; x++ ) {
        var obj = {
            player: playlist[x],
            score: Math.floor(Math.random() * (100000 - 50 + 1)) +50
        };
        playlist[x] = obj;
    }
    var rec = {
        duration: Math.floor(Math.random() * (50000 - 15000 +1)) +15000,
        date: new Date(),
        win: playlist.slice(0,2),
        loss: playlist.slice(2)
    };
    db.game.insert(rec);
}
Answer 2:
I doubt that this can be done in a single query. It can be done using separate queries for wins and losses, like this (for wins):
db.match.aggregate([{$unwind:"$win"}, {$group:{_id:"$win.player", wins:{$sum:1}}}])
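With this two-query approach, the per-player rows still have to be combined on the client. A minimal sketch of that merge in plain JavaScript follows. It assumes the rows from both queries have already been collected into arrays; the helper name mergeCounts is hypothetical, and the losses query in the comment is simply the wins query with "win" swapped for "loss".

```javascript
// The two inputs would come from queries along these lines:
//   db.match.aggregate([{$unwind:"$win"},  {$group:{_id:"$win.player",  wins:{$sum:1}}}])
//   db.match.aggregate([{$unwind:"$loss"}, {$group:{_id:"$loss.player", losses:{$sum:1}}}])
// mergeCounts is a hypothetical helper that joins them by player name.
function mergeCounts(winRows, lossRows) {
  const byPlayer = new Map();

  // Fetch the combined row for a player, creating it with zero counts if new.
  function entry(player) {
    if (!byPlayer.has(player)) {
      byPlayer.set(player, { player: player, wins: 0, losses: 0 });
    }
    return byPlayer.get(player);
  }

  for (const row of winRows) {
    entry(row._id).wins = row.wins;
  }
  for (const row of lossRows) {
    entry(row._id).losses = row.losses;
  }
  return Array.from(byPlayer.values());
}

// Sample rows using the numbers from the question's desired output.
const merged = mergeCounts(
  [{ _id: "Player1", wins: 12 }, { _id: "Player2", wins: 7 }],
  [{ _id: "Player1", losses: 2 }, { _id: "Player2", losses: 8 }]
);
```

Players who only ever won or only ever lost still get a row, with the missing count left at 0.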
Source: https://stackoverflow.com/questions/22301716/mongodb-aggregation-sum-based-on-array-names