Get documents with tags in list, ordered by total number of matches

梦谈多话 2020-12-09 06:36

Given the following MongoDB collection of documents:

{
 title : 'shirt one',
 tags : [
  'shirt',
  'cotton',
  't-shirt',
  'black'
 ]
},
{
 title         


        
3 answers
  • 2020-12-09 07:05

    As I answered in "In MongoDB search in an array and sort by number of matches":

    It's possible using the Aggregation Framework.

    Assumptions

    • tags attribute is a set (no repeated elements)

    Query

    This approach forces you to unwind the tags and re-evaluate the match predicate against the unwound documents, so it's really inefficient.

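    db.test_col.aggregate([
        // Keep only documents that contain at least one of the searched tags.
        {$match: {tags: {$in: ["shirt","cotton","black"]}}},
        // Emit one document per tag.
        {$unwind: "$tags"},
        // Drop the unwound tags that are not in the search list.
        {$match: {tags: {$in: ["shirt","cotton","black"]}}},
        // Count the remaining tags per original document.
        {$group: {
            _id: {"_id": "$_id"},
            matches: {$sum: 1}
        }},
        // Most matches first.
        {$sort: {matches: -1}}
    ]);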
    

    Expected Results

    {
        "result" : [
            {
                "_id" : {
                    "_id" : ObjectId("5051f1786a64bd2c54918b26")
                },
                "matches" : 3
            },
            {
                "_id" : {
                    "_id" : ObjectId("5051f1726a64bd2c54918b24")
                },
                "matches" : 2
            },
            {
                "_id" : {
                    "_id" : ObjectId("5051f1756a64bd2c54918b25")
                },
                "matches" : 1
            }
        ],
        "ok" : 1
    }
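
    As a possible alternative to the unwind-and-rematch pipeline above, here is a minimal sketch that counts the overlap per document with $setIntersection and $size. It assumes a MongoDB version that provides both operators (2.6 or later, as far as I know) and, as before, that tags has no repeated elements. buildTagMatchPipeline is a hypothetical helper name, not part of any MongoDB API; it only builds the pipeline array, which you would then pass to db.test_col.aggregate(...).

```javascript
// Hypothetical helper: builds an aggregation pipeline that ranks documents
// by how many of their tags appear in searchTags, without $unwind.
function buildTagMatchPipeline(searchTags) {
    return [
        // Keep only documents that share at least one tag with the search list.
        { $match: { tags: { $in: searchTags } } },
        // Count the overlap between the document's tags and the search list.
        { $project: { matches: { $size: { $setIntersection: ['$tags', searchTags] } } } },
        // Best matches first.
        { $sort: { matches: -1 } }
    ];
}

const pipeline = buildTagMatchPipeline(['shirt', 'cotton', 'black']);
// Print the pipeline so it can be inspected or pasted into the shell.
console.log(JSON.stringify(pipeline, null, 2));
```

    Note that the result documents have a flat _id here rather than the nested {"_id": ...} shape shown above, and that a search tag beginning with "$" would be read as a field path, so this sketch is only suitable for plain tag strings.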
    
  • 2020-12-09 07:05

    Right now, it isn't possible to do unless you use MapReduce. The only problem with MapReduce is that it is slow (compared to a normal query).

    The aggregation framework is slated for 2.2 (so it should be available in the 2.1 dev release) and should make this sort of thing much easier to do without MapReduce.

    Personally, I do not think using M/R is an efficient way to do it. I would rather query for all the matching documents and do those calculations on the application side. It is easier and cheaper to scale your app servers than it is to scale your database servers, so let the app servers do the number crunching. Of course, this approach may not work for you given your data access patterns and requirements.
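
    The application-side calculation could look roughly like the sketch below. It is plain JavaScript with no database access: rankByTagMatches is a hypothetical helper, and docs stands in for whatever array your driver returns from an ordinary find() with an $in filter on tags. The sample documents other than 'shirt one' are made up for illustration.

```javascript
// Hypothetical helper: rank already-fetched documents by how many of their
// tags appear in searchTags. Assumes each document's tags has no duplicates.
function rankByTagMatches(docs, searchTags) {
    const wanted = new Set(searchTags);
    return docs
        .map(doc => ({
            _id: doc._id,
            title: doc.title,
            // Count the tags of this document that are in the search list.
            matches: doc.tags.filter(tag => wanted.has(tag)).length
        }))
        // Drop documents with no matching tag at all.
        .filter(doc => doc.matches > 0)
        // Most matches first.
        .sort((a, b) => b.matches - a.matches);
}

// Sample data standing in for the query result (illustrative only).
const docs = [
    { _id: 2, title: 'shirt two', tags: ['shirt', 'white'] },
    { _id: 1, title: 'shirt one', tags: ['shirt', 'cotton', 't-shirt', 'black'] },
    { _id: 3, title: 'hat', tags: ['wool', 'hat'] }
];

const ranked = rankByTagMatches(docs, ['shirt', 'cotton', 'black']);
console.log(ranked);
```

    The trade-off is that every candidate document crosses the network before it is ranked, so this only pays off while the filtered result set stays reasonably small.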

    An even simpler approach may be to include a count property in each of your tag objects and, whenever you $push a new tag to the array, also $inc the count property. This is a common pattern in the MongoDB world, at least until the aggregation framework arrives.

  • 2020-12-09 07:08

    I'll second @Bryan in saying that MapReduce is the only possible way at the moment (and it's far from perfect). But, in case you desperately need it, here you go :-)

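        // Map: emit one {matches: 1} per tag that is in the search list.
        var m = function() {
            var searchTerms = ['shirt', 'cotton', 'black'];
            var me = this;
            this.tags.forEach(function(t) {
                searchTerms.forEach(function(st) {
                    if (t === st) {
                        emit(me._id, {matches : 1});
                    }
                });
            });
        };
    
        // Reduce: sum the match counts emitted for each document _id.
        var r = function(k, vals) {
            var result = {matches : 0};
            vals.forEach(function(v) {
                result.matches += v.matches;
            });
            return result;
        };
    
        // Write the results to the 'found01' collection.
        db.shirts.mapReduce(m, r, {out: 'found01'});
    
        // Read them back, best matches first (the count is stored under 'value').
        db.found01.find().sort({'value.matches': -1});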
    