How to search comma-separated data in MongoDB

Posted by 陌路散爱 on 2020-01-02 07:09:44

Question


I have a movie database with several fields. The genre field contains a comma-separated string like:

{genre: 'Action, Adventure, Sci-Fi'}

I know I can use a regular expression to find matches. I also tried:

{'genre': {'$in': genre}}

The problem is the running time: a query takes a long time to return results. The database has about 300K documents, and I have created a normal index on the 'genre' field.


Answer 1:


I would use Map-Reduce to create a separate collection that stores the genre as an array, with values obtained by splitting the comma-separated string. You run the Map-Reduce job once and then query the output collection.

For example, I've inserted some sample documents into the foo collection:

db.foo.insert([
    {genre: 'Action, Adventure, Sci-Fi'},
    {genre: 'Thriller, Romantic'},
    {genre: 'Comedy, Action'}
])

The following map/reduce operation then produces the collection against which you can run performant queries:

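map = function() {
    // Split the comma-separated string, dropping whitespace around each comma.
    var array = this.genre.split(/\s*,\s*/);
    emit(this._id, array);
}

// Each _id is emitted exactly once, so MongoDB never needs to call reduce here.
reduce = function(key, values) {
    return values;
}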

result = db.runCommand({
    "mapreduce" : "foo", 
    "map" : map,
    "reduce" : reduce,
    "out" : "foo_result"
});
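Note that mapReduce has been deprecated since MongoDB 5.0, so on newer servers the same output collection can be produced with an aggregation pipeline. The following is a minimal sketch, not part of the original answer. It assumes MongoDB 4.0 or later (for $trim), assumes every document has a string genre field, and reuses the foo and foo_result names from above:

```javascript
// Sketch: build foo_result with an aggregation pipeline instead of mapReduce.
var pipeline = [
    {
        $project: {
            // Split on commas, then trim the whitespace around each genre.
            value: {
                $map: {
                    input: { $split: ["$genre", ","] },
                    as: "g",
                    in: { $trim: { input: "$$g" } }
                }
            }
        }
    },
    // Write the results to the output collection (replacing it if it exists).
    { $out: "foo_result" }
];

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    db.foo.aggregate(pipeline);
}
```

The resulting documents have the same shape as the map/reduce output (an _id plus a value array), so the index and query that follow should apply unchanged.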

Querying is then straightforward, using a multikey index on the value field:

db.foo_result.createIndex({"value": 1});

var genre = ['Action', 'Adventure'];
db.foo_result.find({'value': {'$in': genre}})

Output:

/* 0 */
{
    "_id" : ObjectId("55842af93cab061ff5c618ce"),
    "value" : [ 
        "Action", 
        "Adventure", 
        "Sci-Fi"
    ]
}

/* 1 */
{
    "_id" : ObjectId("55842af93cab061ff5c618d0"),
    "value" : [ 
        "Comedy", 
        "Action"
    ]
}



Answer 2:


Well, you cannot really do this efficiently, so I'm glad you used the tag "performance" on your question.

If you want to do this with the comma-separated data left in place as a string, you have two options.

Either use a regex, if that suits:

db.collection.find({ "genre": { "$regex": "Sci-Fi" } })

But that is not really efficient.
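One caveat with the plain regex is that it matches substrings, so a pattern like "Action" would also match a hypothetical genre such as "Live-Action". Below is a minimal sketch of a stricter pattern, assuming items are separated by a comma with optional whitespace. genreRegex is a hypothetical helper, not part of the original answer:

```javascript
// Hypothetical helper: build a regex that matches `genre` only as a whole
// item of a comma-separated list.
function genreRegex(genre) {
    // Escape regex metacharacters in the genre name.
    var escaped = genre.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return new RegExp("(^|,)\\s*" + escaped + "\\s*(,|$)");
}

var re = genreRegex("Action");
re.test("Action, Adventure, Sci-Fi"); // true
re.test("Comedy, Action");            // true
re.test("Live-Action, Drama");        // false

// In the mongo shell the regex can be passed directly as the query value.
if (typeof db !== "undefined") {
    db.collection.find({ "genre": genreRegex("Sci-Fi") });
}
```

This only improves correctness. The pattern is still not a prefix match, so it cannot narrow an index scan and the speed problem remains.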

Or use JavaScript evaluation via $where:

db.collection.find(function() {
    return (
        this.genre.split(",")
            .map(function(el) {
                // Strip the leading whitespace left after each comma.
                return el.replace(/^\s+/, "");
            })
            .indexOf("Sci-Fi") != -1
    );
})

This is not really efficient either, and probably performs about the same as the regex.

Or, better yet, something that can use an index: separate the values into an array and use a basic query:

{
    "genre": [ "Action", "Adventure", "Sci-Fi" ] 
}

With an index:

db.collection.createIndex({ "genre": 1 })

Then query:

db.collection.find({ "genre": "Sci-Fi" })

When you store the data that way, it's that simple. And really efficient.
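The answer does not show how to get from the existing strings to arrays. One possible one-off conversion in the mongo shell is sketched below. splitGenres is a hypothetical helper, and the sketch assumes the collection is small enough to update document by document:

```javascript
// Hypothetical helper: turn 'Action, Adventure, Sci-Fi' into
// ['Action', 'Adventure', 'Sci-Fi'], dropping empty items.
function splitGenres(str) {
    return str.split(",")
        .map(function(el) { return el.trim(); })
        .filter(function(el) { return el.length > 0; });
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
    // BSON type 2 is string; documents already converted are skipped by the
    // typeof check below.
    db.collection.find({ "genre": { "$type": 2 } }).forEach(function(doc) {
        if (typeof doc.genre === "string") {
            db.collection.updateOne(
                { "_id": doc._id },
                { "$set": { "genre": splitGenres(doc.genre) } }
            );
        }
    });
}
```

The typeof check makes the script safe to run again if it is interrupted partway through.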

You make the choice.



Source: https://stackoverflow.com/questions/30940908/how-to-search-comma-separated-data-in-mongodb
