Using map/reduce for mapping the properties in a collection

若如初见. Submitted on 2019-12-17 03:56:05

Question


Update: follow-up to MongoDB Get names of all keys in collection.

As pointed out by Kristina, one can use MongoDB's map/reduce to list the keys in a collection:

db.things.insert( { type : ['dog', 'cat'] } );
db.things.insert( { egg : ['cat'] } );
db.things.insert( { type :  [] }); 
db.things.insert( { hello : []  } );

mr = db.runCommand({"mapreduce" : "things",
"map" : function() {
    for (var key in this) { emit(key, null); }
},  
"reduce" : function(key, stuff) { 
   return null;
}}) 

db[mr.result].distinct("_id")

//output: [ "_id", "egg", "hello", "type" ]
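To see what the map step contributes, it can be simulated locally in plain JavaScript. This is only a sketch: `things` is a stand-in array for the collection (so there is no `_id` field), and `emit` is a hypothetical local replacement for the server-side `emit` that simply records each key.

```javascript
// Stand-in for the documents inserted above (no _id, since nothing is stored).
const things = [
  { type: ['dog', 'cat'] },
  { egg: ['cat'] },
  { type: [] },
  { hello: [] }
];

// Local stand-in for emit(): a Set plays the role of grouping by key
// followed by distinct("_id").
const keys = new Set();
function emit(key, value) { keys.add(key); }

// Same body as the map function above, called with each document as `this`.
function map() {
  for (var key in this) { emit(key, null); }
}

things.forEach(doc => map.call(doc));
const firstLevelKeys = Array.from(keys).sort();
console.log(firstLevelKeys); // [ 'egg', 'hello', 'type' ]
```

Because the loop only enumerates the properties of the document itself, nothing below the first level is ever emitted.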

This works fine as long as we only want the keys located at the first level of depth. However, it fails to retrieve keys located at deeper levels. If we add a new record:

db.things.insert({foo: {bar: {baaar: true}}})

and run the map-reduce + distinct snippet above again, we get:

[ "_id", "egg", "foo", "hello", "type" ] 

But we will not get the bar and baaar keys, which are nested further down in the data structure. The question is: how do I retrieve all keys, no matter their level of depth? Ideally, I would like the script to walk down every level of depth, producing an output such as:

["_id","egg","foo","foo.bar","foo.bar.baaar","hello","type"]      

Thank you in advance!


Answer 1:


OK, this is a little more complex because you'll need to use some recursion.

To make the recursion happen, you'll need to be able to store some functions on the server.

Step 1: define some functions and put them server-side

isArray = function (v) {
  return v && typeof v === 'object' && typeof v.length === 'number' && !(v.propertyIsEnumerable('length'));
}

m_sub = function(base, value){
  for(var key in value) {
    emit(base + "." + key, null);
    if( isArray(value[key]) || typeof value[key] == 'object'){
      m_sub(base + "." + key, value[key]);
    }
  }
}

db.system.js.save( { _id : "isArray", value : isArray } );
db.system.js.save( { _id : "m_sub", value : m_sub } );

Step 2: define the map and reduce functions

map = function(){
  for(var key in this) {
    emit(key, null);
    if( isArray(this[key]) || typeof this[key] == 'object'){
      m_sub(key, this[key]);
    }
  }
}

reduce = function(key, stuff){ return null; }

Step 3: run the map reduce and look at results

mr = db.runCommand({"mapreduce" : "things", "map" : map, "reduce" : reduce,"out": "things" + "_keys"});
db[mr.result].distinct("_id");

The results you'll get are:

["_id", "_id.isObjectId", "_id.str", "_id.tojson", "egg", "egg.0", "foo", "foo.bar", "foo.bar.baaar", "hello", "type", "type.0", "type.1"]

There's one obvious problem here: we're adding some unexpected fields: 1. the _id data (_id.isObjectId, _id.str, _id.tojson) 2. the numeric array indexes (.0 on egg, .0 and .1 on type)
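The array-index entries appear because `for...in` enumerates array indexes as property names. A small local sketch reproduces this outside the server; the `emit` below is a hypothetical stand-in that just records keys, and the recursion mirrors `m_sub` rather than being the stored function itself.

```javascript
// Local stand-in for MongoDB's emit(): record each emitted key.
const emitted = new Set();
function emit(key, value) { emitted.add(key); }

// Same recursion as m_sub above, run locally for illustration.
function m_sub(base, value) {
  for (var key in value) {
    emit(base + '.' + key, null);
    if (typeof value[key] === 'object') {
      m_sub(base + '.' + key, value[key]);
    }
  }
}

m_sub('type', ['dog', 'cat']);          // array: indexes show up as keys
m_sub('foo', { bar: { baaar: true } }); // nested object: real keys
const emittedKeys = Array.from(emitted).sort();
console.log(emittedKeys);
// [ 'foo.bar', 'foo.bar.baaar', 'type.0', 'type.1' ]
```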

Step 4: Some possible fixes

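For problem #1 the fix is relatively easy. The _id.* entries come from the map function recursing into the ObjectId value, so just modify the map function to skip that recursion. Change this:

if( isArray(this[key]) || typeof this[key] == 'object'){

to this:

if( key != "_id" && (isArray(this[key]) || typeof this[key] == 'object') ){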

Problem #2 is a little more dicey. You wanted all keys and technically "egg.0" is a valid key. You can modify m_sub to ignore such numeric keys. But it's also easy to see a situation where this backfires. Say you have an associative array inside of a regular array, then you want that "0" to appear. I'll leave the rest of that solution up to you.
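One possible way to handle problem #2, sketched in plain JavaScript rather than as server-side code: a hypothetical `collectKeys` helper (the name and its inner `walk` function are illustrative, not part of MongoDB) that never records array indexes but still descends into array elements, so keys of objects nested inside arrays are kept. This is one design choice among several; it deliberately drops the "0" that the caveat above says you might sometimes want.

```javascript
// Hypothetical helper: collect dotted key paths from a plain object,
// skipping array indexes but descending into array elements.
function collectKeys(doc) {
  const seen = new Set();

  function walk(base, value) {
    if (Array.isArray(value)) {
      // Do not record "0", "1", ...; visit elements under the same base path.
      value.forEach(item => walk(base, item));
      return;
    }
    // Primitives and null have no keys to report.
    if (value === null || typeof value !== 'object') return;
    for (const key of Object.keys(value)) {
      const path = base ? base + '.' + key : key;
      seen.add(path);
      walk(path, value[key]);
    }
  }

  walk('', doc);
  return Array.from(seen).sort();
}

const sample = {
  type: ['dog', 'cat'],
  foo: { bar: { baaar: true } },
  list: [{ name: 'x' }]
};
console.log(collectKeys(sample));
// [ 'foo', 'foo.bar', 'foo.bar.baaar', 'list', 'list.name', 'type' ]
```

The same test on `Array.isArray` could be moved into m_sub if you prefer to keep the work inside map/reduce.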




Answer 2:


With Gates VP's and Kristina's answers as inspiration, I created an open source tool called Variety which does exactly this: https://github.com/variety/variety

Hopefully you'll find it to be useful. Let me know if you have questions, or any issues using it.




Answer 3:


As a simple function for the Node.js driver (first-level keys only):

const getProps = (db, collection) => new Promise((resolve, reject) => {
  db
  .collection(collection)
  .mapReduce(function() {
    // Emit every first-level key of the document
    for (var key in this) { emit(key, null) }
  }, (key, values) => null, {
    out: collection + '_keys'
  }, (err, collection_props) => {
    // Stop here on failure; collection_props is not usable in that case
    if (err) return reject(err)

    collection_props
    .find()
    .toArray()
    .then(
      props => resolve(props.map(({_id}) => _id))
    )
    .catch(reject)
  })
})


Source: https://stackoverflow.com/questions/2997004/using-map-reduce-for-mapping-the-properties-in-a-collection
