Join two collections with MapReduce in MongoDB

Posted by £可爱£侵袭症+ on 2019-12-06 02:36:25

In your problem, first_name can be fetched only from the Employees collection and dep_name only from the Departments collection.

You can achieve this with either MapReduce or the aggregation framework.

1. MapReduce solution

If you modify your map and reduce functions as follows

var mapD = function() {
  for (var i=0; i<this.employees.length; i++)
    emit(this.employees[i], { dep_id: this._id, dep_name: this.dep_name });  
}

var mapE = function() { emit(this._id, { first_name: this.first_name }); }

var reduceLookUp = function(key, values) {
  var results = {};
  var departments = [];
  values.forEach(function(value) {
    // Raw value emitted by mapD: a single department.
    if (value.dep_id !== undefined)
      departments.push({ dep_id: value.dep_id, dep_name: value.dep_name });
    // Value emitted by mapE, or an already reduced value carrying the name.
    if (value.first_name !== undefined) results["first_name"] = value.first_name;
    // Already reduced value: append its departments instead of overwriting,
    // because MongoDB may feed reduce output back into reduce.
    if (value.departments !== undefined)
      departments = departments.concat(value.departments);
  });
  if (departments.length > 0) results["departments"] = departments;
  return results;
}

then the first MapReduce call

db.Departments.mapReduce(mapD, reduceLookUp, { out: { reduce: "joined" } });

will insert into the joined collection

{ 
  "_id" : "1234", 
  "value" : 
  {
    "departments" : 
    [ 
      { "dep_id" : "d001", "dep_name" : "Sales" }, 
      { "dep_id" : "d004", "dep_name" : "Quality M" } 
    ] 
  }
}

while the second call

db.Employees.mapReduce(mapE, reduceLookUp, { out: { reduce: "joined" } });

should insert

{ "_id" : "1234", "value" : { "first_name" : "John" } }

but, according to the documentation, the reduce output option will

Merge the new result with the existing result if the output collection already exists. If an existing document has the same key as the new result, apply the reduce function to both the new and the existing documents and overwrite the existing document with the result

Thus, the reduce function will be called again in your case with the parameters

key = "1234",
values =
[
  {
    "departments" : 
    [ 
      { "dep_id" : "d001", "dep_name" : "Sales" }, 
      { "dep_id" : "d004", "dep_name" : "Quality M" } 
    ] 
  },
  { "first_name" : "John" }
]

and the final result is

{ 
  "_id" : "1234", 
  "value" : 
  { 
    "first_name" : "John", 
    "departments" : 
    [ 
      { "dep_id" : "d001", "dep_name" : "Sales" }, 
      { "dep_id" : "d004", "dep_name" : "Quality M" }
    ] 
  } 
}
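
The two-pass merge described above can be tried without a database. The following is only a rough sketch in plain JavaScript: runMapReduce is a hypothetical stand-in for mapReduce with out: { reduce: ... }, the sample documents are assumed from the output shown in this answer, and reduceLookUp is written so that it appends the departments of already reduced values.

```javascript
// Assumed sample documents, shaped like the ones in this answer.
var departments = [
  { _id: "d001", dep_name: "Sales", employees: ["1234"] },
  { _id: "d004", dep_name: "Quality M", employees: ["1234"] }
];
var employees = [{ _id: "1234", first_name: "John" }];

var emit; // assigned by runMapReduce before each map call

var mapD = function() {
  for (var i = 0; i < this.employees.length; i++)
    emit(this.employees[i], { dep_id: this._id, dep_name: this.dep_name });
};
var mapE = function() { emit(this._id, { first_name: this.first_name }); };

var reduceLookUp = function(key, values) {
  var results = {};
  var deps = [];
  values.forEach(function(value) {
    if (value.dep_id !== undefined)
      deps.push({ dep_id: value.dep_id, dep_name: value.dep_name });
    if (value.first_name !== undefined) results.first_name = value.first_name;
    if (value.departments !== undefined) deps = deps.concat(value.departments);
  });
  if (deps.length > 0) results.departments = deps;
  return results;
};

// Hypothetical emulation of mapReduce with out: { reduce: target }.
function runMapReduce(docs, map, reduce, target) {
  var groups = {};
  emit = function(key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function(doc) { map.call(doc); });
  Object.keys(groups).forEach(function(key) {
    var values = groups[key];
    // Reduce is skipped when a key has only a single value.
    var result = values.length > 1 ? reduce(key, values) : values[0];
    // Merge step: re-reduce the existing document with the new result.
    if (target[key] !== undefined) result = reduce(key, [target[key], result]);
    target[key] = result;
  });
}

var joined = {};
runMapReduce(departments, mapD, reduceLookUp, joined);
runMapReduce(employees, mapE, reduceLookUp, joined);
console.log(JSON.stringify(joined["1234"], null, 2));
```

The second pass emits a single value for key "1234", so reduce is invoked only by the merge step, with the existing document and the new value as its two inputs. That is why the reduce function has to accept its own output as input.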

2. Aggregation framework solution

A better solution for your problem is to use the aggregation framework instead of MapReduce. Here you would use the $lookup stage (available since MongoDB 3.2) to fetch the matching documents from Employees:

db.Departments.aggregate([
  { $unwind: "$employees" },
  { 
    $lookup: 
      { 
        from: "Employees", 
        localField: "employees", 
        foreignField: "_id", 
        as: "employee"
      }
  },
  { $unwind: "$employee" },
  { 
    $group: 
      { 
        "_id": "$employees",
        "first_name": { $first: "$employee.first_name" }, 
        "departments": { $push: { dep_id: "$_id", dep_name: "$dep_name" } } 
      } 
  } 
]);

which results in

{ 
  "_id" : "1234",
  "first_name" : "John",
  "departments" : 
    [ 
      { "dep_id" : "d001", "dep_name" : "Sales" }, 
      { "dep_id" : "d004", "dep_name" : "Quality M" } 
    ] 
}
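
To see what each stage of the pipeline contributes, here is a rough plain-JavaScript imitation of it on in-memory arrays. It is only a sketch of the stage semantics, not how MongoDB executes the pipeline; the sample documents are assumed, and employee id "9999" is a hypothetical id with no Employees document, added to show that unmatched rows are dropped.

```javascript
// Assumed sample data; "9999" deliberately has no Employees document.
var departments = [
  { _id: "d001", dep_name: "Sales", employees: ["1234"] },
  { _id: "d004", dep_name: "Quality M", employees: ["1234", "9999"] }
];
var employees = [{ _id: "1234", first_name: "John" }];

// $unwind: "$employees" -> one row per (department, employee id) pair.
var unwound = [];
departments.forEach(function(dep) {
  dep.employees.forEach(function(empId) {
    unwound.push({ _id: dep._id, dep_name: dep.dep_name, employees: empId });
  });
});

// $lookup followed by $unwind: "$employee" -> attach each matching employee;
// a row whose lookup array is empty disappears, as with the default $unwind.
var lookedUp = [];
unwound.forEach(function(row) {
  employees
    .filter(function(e) { return e._id === row.employees; })
    .forEach(function(e) {
      lookedUp.push({
        _id: row._id, dep_name: row.dep_name,
        employees: row.employees, employee: e
      });
    });
});

// $group by employee id: $first of the name, $push of every department.
var groups = {};
lookedUp.forEach(function(row) {
  var g = groups[row.employees];
  if (!g) {
    g = groups[row.employees] = {
      _id: row.employees,
      first_name: row.employee.first_name,
      departments: []
    };
  }
  g.departments.push({ dep_id: row._id, dep_name: row.dep_name });
});

var result = Object.keys(groups).map(function(k) { return groups[k]; });
console.log(JSON.stringify(result, null, 2));
```

Unlike the MapReduce variant, which would still write a document for "9999" containing only department data, this pipeline behaves like an inner join and returns only employees that exist in both collections.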