Given the following MongoDB collection of documents:
{
    title: 'shirt one',
    tags: [
        'shirt',
        'cotton',
        't-shirt',
        'black'
    ]
},
{
title
As I answered in "In MongoDB search in an array and sort by number of matches", this is possible using the Aggregation Framework.
Assumptions
The tags attribute is a set (no repeated elements).
Query
This approach forces you to unwind the documents and re-evaluate the match predicate against the unwound results, so it is really inefficient.
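db.test_col.aggregate([
    // Keep only documents that contain at least one of the search tags
    {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
    // Emit one document per tag
    {$unwind: "$tags"},
    // Keep only the unwound tags that are search tags
    {$match: {tags: {$in: ["shirt", "cotton", "black"]}}},
    // Count the surviving tags per original document
    {$group: {
        _id: {"_id": "$_id"},
        matches: {$sum: 1}
    }},
    {$sort: {matches: -1}}
]);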
Expected Results
{
"result" : [
{
"_id" : {
"_id" : ObjectId("5051f1786a64bd2c54918b26")
},
"matches" : 3
},
{
"_id" : {
"_id" : ObjectId("5051f1726a64bd2c54918b24")
},
"matches" : 2
},
{
"_id" : {
"_id" : ObjectId("5051f1756a64bd2c54918b25")
},
"matches" : 1
}
],
"ok" : 1
}
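To make the stages of the pipeline easier to follow, here is a minimal sketch that mimics them in plain JavaScript on an in-memory array. It is not how MongoDB executes the pipeline; the function name simulatePipeline, the short string ids, and every sample document other than the first one's tags are made up for illustration.

```javascript
// Hypothetical sample data (only the tags of "a" come from the question).
const docs = [
  { _id: "a", tags: ["shirt", "cotton", "t-shirt", "black"] },
  { _id: "b", tags: ["shirt", "white", "dotted"] },
  { _id: "c", tags: ["shirt", "cotton", "collar"] },
  { _id: "d", tags: ["jeans", "blue"] },
];
const terms = ["shirt", "cotton", "black"];

function simulatePipeline(docs, terms) {
  // First $match: keep documents that contain at least one search term
  const matched = docs.filter(d => d.tags.some(t => terms.includes(t)));
  // $unwind: one row per (document, tag) pair
  const unwound = matched.flatMap(d =>
    d.tags.map(tag => ({ _id: d._id, tags: tag }))
  );
  // Second $match: keep only rows whose single tag is a search term
  const hits = unwound.filter(row => terms.includes(row.tags));
  // $group with $sum: 1, i.e. count the surviving rows per document
  const counts = new Map();
  for (const row of hits) {
    counts.set(row._id, (counts.get(row._id) || 0) + 1);
  }
  // $sort by matches, descending
  return [...counts]
    .map(([_id, matches]) => ({ _id, matches }))
    .sort((a, b) => b.matches - a.matches);
}

console.log(simulatePipeline(docs, terms));
```

The second filter is the step that the answer calls inefficient: after unwinding, every tag of every matched document has to be checked again.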
Right now, it isn't possible unless you use MapReduce. The only problem with MapReduce is that it is slow (compared to a normal query).
The aggregation framework is slated for 2.2 (so it should be available in the 2.1 dev release) and should make this sort of thing much easier to do without MapReduce.
Personally, I do not think using M/R is an efficient way to do it. I would rather query for all the documents and do those calculations on the application side. It is easier and cheaper to scale your app servers than it is to scale your database servers, so let the app servers do the number crunching. Of course, this approach may not work for you given your data access patterns and requirements.
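As a rough sketch of the application-side approach, assuming the documents have already been fetched into an array and that tags contain no duplicates, the counting and sorting could look like this. The function name rankByTagMatches and the sample documents other than 'shirt one' are hypothetical.

```javascript
// Rank already-fetched documents by how many search terms appear in their tags.
function rankByTagMatches(docs, searchTerms) {
  const wanted = new Set(searchTerms);
  return docs
    // Count the tags of each document that are search terms
    .map(doc => ({ doc, matches: doc.tags.filter(t => wanted.has(t)).length }))
    // Drop documents that match nothing
    .filter(entry => entry.matches > 0)
    // Most matches first
    .sort((a, b) => b.matches - a.matches);
}

// Hypothetical documents standing in for the result of a plain query
const fetched = [
  { title: "shirt two", tags: ["shirt", "white"] },
  { title: "shirt one", tags: ["shirt", "cotton", "t-shirt", "black"] },
  { title: "jeans", tags: ["denim", "blue"] },
];

const ranked = rankByTagMatches(fetched, ["shirt", "cotton", "black"]);
console.log(ranked.map(e => e.doc.title + ": " + e.matches));
```

If a tag can appear more than once in a document, this sketch counts each occurrence, so the set assumption matters.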
An even simpler approach may be to just include a count property in each of your tag objects and, whenever you $push a new tag to the array, also $inc the count property. This is a common pattern in the MongoDB world, at least until the aggregation framework is available.
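One possible reading of this pattern, sketched below, keeps the counter as a top-level count field next to the tags array and changes both in a single update. Where exactly the counter lives is an assumption here, as are the helper name buildAddTagUpdate and the field name count.

```javascript
// Build an update document that adds one tag and bumps a counter together.
// ASSUMPTION: the counter is a top-level field named `count`.
function buildAddTagUpdate(tag) {
  return {
    $push: { tags: tag },   // append the new tag to the array
    $inc: { count: 1 },     // keep the stored count in step
  };
}

console.log(JSON.stringify(buildAddTagUpdate("cotton")));
```

In the shell, the returned object would be passed as the update argument of an update call on the collection, so the array and the counter change together in one operation.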
I'll second @Bryan in saying that MapReduce is the only possible way at the moment (and it's far from perfect). But, in case you desperately need it, here you go :-)
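// Map: emit {matches: 1} for every tag of this document that equals a search term
var m = function() {
    var searchTerms = ['shirt', 'cotton', 'black'];
    var me = this;
    this.tags.forEach(function(t) {
        searchTerms.forEach(function(st) {
            if (t == st) {
                emit(me._id, {matches: 1});
            }
        });
    });
};
// Reduce: sum the emitted match counts for one document _id
var r = function(k, vals) {
    var result = {matches: 0};
    vals.forEach(function(v) {
        result.matches += v.matches;
    });
    return result;
};
db.shirts.mapReduce(m, r, {out: 'found01'});
// Output documents have the shape {_id: ..., value: {matches: N}}
db.found01.find().sort({'value.matches': -1});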
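To see what the map and reduce functions do without a database, here is a small local stand-in for mapReduce. It is only an illustration, not MongoDB's implementation: runMapReduce is a made-up helper, emit is passed in as an argument instead of being a global as in the shell, and the second sample document is invented.

```javascript
// Local stand-in for mapReduce, for illustration only.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  // Collect every emitted value under its key
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: call map with `this` bound to each document
  for (const doc of docs) map.call(doc, emit);
  // Reduce phase: like MongoDB, skip reduce for keys with a single value
  return [...groups].map(([key, vals]) => ({
    _id: key,
    value: vals.length === 1 ? vals[0] : reduce(key, vals),
  }));
}

const mapFn = function (emit) {
  const searchTerms = ["shirt", "cotton", "black"];
  this.tags.forEach(t => {
    if (searchTerms.includes(t)) emit(this._id, { matches: 1 });
  });
};

const reduceFn = function (key, vals) {
  const result = { matches: 0 };
  vals.forEach(v => { result.matches += v.matches; });
  return result;
};

// Hypothetical sample data
const sample = [
  { _id: 2, tags: ["shirt", "white"] },
  { _id: 1, tags: ["shirt", "cotton", "t-shirt", "black"] },
];

const found = runMapReduce(sample, mapFn, reduceFn)
  .sort((a, b) => b.value.matches - a.value.matches);
console.log(found);
```

Because the reduce step is skipped for keys with only one emitted value, the emitted values and the reduce result need the same shape, which is why both use {matches: N}.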