Use MongoDB aggregation to find set intersection of two sets within the same document

纵然是瞬间 提交于 2020-01-15 09:07:45

问题


I'm trying to use the Mongo aggregation framework to find where there are records that have different unique sets within the same document. An example will best explain this:

Here is a document that is not my real data, but conceptually the same:

db.house.insert(
{
    houseId : 123,
    rooms: [{ name : 'bedroom',
          owns : [
            {name : 'bed'},
            {name : 'cabinet'}
        ]},
        { name : 'kitchen',
          owns : [
            {name : 'sink'},
            {name : 'cabinet'}
        ]}],
       uses : [{name : 'sink'},
               {name : 'cabinet'},
               {name : 'bed'},
               {name : 'sofa'}]
}
)

Notice that there are two hierarchies with similar items. It is also possible to use items that are not owned. I want to find documents like this one: where there is a house that uses something that it doesn't own.

So far I've built up the structure using the aggregate framework like below. This gets me to 2 sets of distinct items. However I haven't been able to find anything that could give me the result of a set intersection. Note that a simple count of set size will not work due to something like this: ['couch', 'cabinet'] compare to ['sofa', 'cabinet'].

{'$unwind':'$uses'}
{'$unwind':'$rooms'}
{'$unwind':'$rooms.owns'}
{'$group' : {_id:'$houseId', 
             use:{'$addToSet':'$uses.name'}, 
             own:{'$addToSet':'$rooms.owns.name'}}}

produces:

{ _id : 123,
  use : ['sink', 'cabinet', 'bed', 'sofa'],
  own : ['bed', 'cabinet', 'sink']
}

How do I then find the set intersection of use and own in the next stage of the pipeline?


回答1:


You were not very far from the full solution with aggregation framework - you needed one more thing before the $group step and that is something that would allow you to see if all the things that are being used match up with something that is owned.

Here is the full pipeline

> db.house.aggregate(
       {'$unwind':'$uses'}, 
       {'$unwind':'$rooms'}, 
       {'$unwind':'$rooms.owns'}, 
       {$project:  { _id:0, 
                     houseId:1, 
                     uses:"$uses.name", 
                     isOkay:{$cond:[{$eq:["$uses.name","$rooms.owns.name"]}, 1, 0]}
                   }
       }, 
       {$group: { _id:{house:"$houseId",item:"$uses"}, 
                  hasWhatHeUses:{$sum:"$isOkay"}
                }
       },
       {$match:{hasWhatHeUses:0}})

and its output on your document

{
    "result" : [
        {
            "_id" : {
                "house" : 123,
                "item" : "sofa"
            },
            "hasWhatHeUses" : 0
        }
    ],
    "ok" : 1
}

Explanation - once you unwrap both arrays you now want to flag the elements where used item is equal to owned item and give them a non-0 "score". Now when you regroup things back by houseId you can check if any used items didn't get a match. Using 1 and 0 for score allows you to do a sum and now a match for item which has sum 0 means it was used but didn't match anything in "owned". Hope you enjoyed this!




回答2:


So here is a solution not using the aggregation framework. This uses the $where operator and javascript. This feels much more clunky to me, but it seems to work so I wanted to put it out there if anyone else comes across this question.

db.houses.find({'$where': 
    function() { 
        var ownSet = {};
        var useSet = {};
        for (var i=0;i<obj.uses.length;i++){ 
            useSet[obj.uses[i].name] = true;
        }
        for (var i=0;i<obj.rooms.length;i++){ 
            var room = obj.rooms[i];
            for (var j=0;j<room.owns.length;j++){
                ownSet[room.owns[j].name] = true;
            }
        }
        for (var prop in ownSet) {
            if (ownSet.hasOwnProperty(prop)) {
                if (!useSet[prop]){
                    return true;
                }
            }
        }
        for (var prop in useSet) {
            if (useSet.hasOwnProperty(prop)) {
                if (!ownSet[prop]){
                    return true;
                }
            }
        }
        return false
    }
})



回答3:


For MongoDB 2.6+ Only

As of MongoDB 2.6, there are set operations available in the project pipeline stage. The way to answer this problem with the new operations is:

db.house.aggregate([
    {'$unwind':'$uses'},
    {'$unwind':'$rooms'},
    {'$unwind':'$rooms.owns'},
    {'$group' : {_id:'$houseId', 
             use:{'$addToSet':'$uses.name'}, 
             own:{'$addToSet':'$rooms.owns.name'}}},
    {'$project': {int:{$setIntersection:["$use","$own"]}}}
]);


来源:https://stackoverflow.com/questions/17264017/use-mongodb-aggregation-to-find-set-intersection-of-two-sets-within-the-same-doc

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!