Question

This question is closely related to another one, and I will take the schema-design advice given there for a NoSQL context into account. Still, I'm curious to understand the following:

Actual questions

Suppose you have the following document:

    _id : 2      abcd
    name : 2     unittest.com
    paths : 4    
        0 : 3    
            path : 2     home
            queries : 4      
                0 : 3    
                    name : 2     query1
                    url : 2      www.unittest.com/home?query1
                    requests: 4

                1 : 3    
                    name : 2     query2
                    url : 2      www.unittest.com/home?query2
                    requests: 4
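
For readers more used to JSON than to rmongodb's BSON dump, the document above has roughly the following shape. This is only a sketch: the variable name `hostDoc` is made up for illustration, and the empty `requests` arrays are an assumption based on the dump showing `requests: 4` (an array) without any children.

```javascript
// JSON view of the example document. In the rmongodb dump, the number after
// each key is the BSON type code: 2 = string, 3 = embedded document, 4 = array.
const hostDoc = {
  _id: "abcd",
  name: "unittest.com",
  paths: [
    {
      path: "home",
      queries: [
        // Assumption: `requests` starts out as an empty array.
        { name: "query1", url: "www.unittest.com/home?query1", requests: [] },
        { name: "query2", url: "www.unittest.com/home?query2", requests: [] }
      ]
    }
  ]
};

// The array the question wants to push to for query1 is
// paths[0].queries[0].requests, i.e. the dotted path "paths.0.queries.0.requests".
console.log(hostDoc.paths[0].queries.map((q) => q.name).join(", "));
```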

Basically, I'd like to know:

whether MongoDB's positional $ operator can be used multiple times, in other words in update scenarios that involve array/document structures with a "degree of nestedness" greater than 1:

{ <update operator>: { "paths.$.queries.$.requests" : value } } (doesn't work)

instead of "only" being able to use $ once for the top-level array and having to use explicit indexes for the arrays nested below it:

{ <update operator>: { "paths.$.queries.0.requests" : value } } (works)

and, if it is possible at all, what the corresponding R syntax would look like.

Below you'll find a reproducible example. I tried to be as concise as possible.

Code example

Database connection

require("rmongodb")
db  <- "__unittest"
ns  <- paste(db, "hosts", sep=".")
# CONNECTION OBJECT
con <- mongo.create(db=db)
# ENSURE EMPTY COLLECTION
mongo.remove(mongo=con, ns=ns)

Example document

# Insert the base document
b <- list("_id"="abcd", name="unittest.com")
mongo.insert(mongo=con, ns=ns, b=b)
# Add a first element to the top-level array paths
q <- list("_id"="abcd")
b <- list("$push"=list(paths=list(path="home")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
# Add two elements to the nested array queries of the matched paths element
q <- list("_id"="abcd", paths.path="home")
b <- list("$push"=list("paths.$.queries"=list(
    name="query1", url="www.unittest.com/home?query1")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
b <- list("$push"=list("paths.$.queries"=list(
    name="query2", url="www.unittest.com/home?query2")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)

Update of nested arrays with explicit position index (works)

This works, but it involves an explicit index for the second-level array queries (nested in a subdoc element of array paths):

q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")
b <- list("$push"=list("paths.$.queries.0.requests"=list(time="2013-02-13")))
> mongo.bson.from.list(b)
    $push : 3    
        paths.$.queries.0.requests : 3   
            time : 2     2013-02-13

mongo.update(mongo=con, ns, criteria=q, objNew=b)
res <- mongo.find.one(mongo=con, ns=ns, query=q)
> res
    _id : 2      abcd
    name : 2     unittest.com
    paths : 4    
        0 : 3    
            path : 2     home
            queries : 4      
                0 : 3    
                    name : 2     query1
                    requests : 4     
                        0 : 3    
                            time : 2     2013-02-13


                    url : 2      www.unittest.com/home?query1

                1 : 3    
                    name : 2     query2
                    url : 2      www.unittest.com/home?query2

Update of nested arrays with positional `$` indexes (doesn't work)

Now, I'd like to substitute the explicit 0 with the positional $ operator, just as I did to have the server find the desired subdoc element of array paths (paths.$.queries).

As far as I understand the documentation, this should work, as the crucial thing is to specify a "correct" query selector:

The positional $ operator, when used with the update() method and acts as a placeholder for the first match of the update query selector:

I think I specified a query selector that does find the correct nested element (due to the paths.queries.name="query1" part):

q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")

I guess that, translated to "plain MongoDB" syntax, the query selector looks somewhat like this:

{ _id: "abcd", "paths.path": "home", "paths.queries.name": "query1" }

which seems like a valid query selector to me. In fact it does match the desired element/doc:

> !is.null(mongo.find.one(mongo=con, ns=ns, query=q))
[1] TRUE

My thought was that if it works on the top level, why shouldn't it work for deeper levels as well (as long as the query selector points to the right nested components)?

However, the server doesn't seem to like a nested or multiple use of $:

b <- list("$push"=list("paths.$.queries.$.requests"=list(time="2013-02-14")))
> mongo.bson.from.list(b)
    $push : 3    
        paths.$.queries.$.requests : 3   
            time : 2     2013-02-14

> mongo.update(mongo=con, ns, criteria=q, objNew=b)
[1] FALSE

I'm not sure whether it doesn't work because MongoDB doesn't support this or because I didn't get the R syntax right.

Answer 1:

The positional $ operator only works one level deep, and it only addresses the first matching element.
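
Given that limitation, one possible workaround is to resolve both array positions on the client and address the nested array with explicit indexes only, much like the "works" variant in the question. The following is a minimal sketch, assuming the document has already been fetched; `buildNestedPush` and `fetched` are hypothetical names for illustration, not part of MongoDB or rmongodb.

```javascript
// Hypothetical helper: locate the matching path and query in an already
// fetched document and build a $push spec that uses explicit indexes for
// both array levels. Returns null when either element is missing.
function buildNestedPush(doc, pathName, queryName, request) {
  const i = (doc.paths || []).findIndex((p) => p.path === pathName);
  if (i === -1) return null;
  const j = (doc.paths[i].queries || []).findIndex((q) => q.name === queryName);
  if (j === -1) return null;
  const field = `paths.${i}.queries.${j}.requests`;
  return { $push: { [field]: request } };
}

// In-memory stand-in for the document from the question.
const fetched = {
  _id: "abcd",
  name: "unittest.com",
  paths: [
    {
      path: "home",
      queries: [
        { name: "query1", url: "www.unittest.com/home?query1" },
        { name: "query2", url: "www.unittest.com/home?query2" }
      ]
    }
  ]
};

const pushSpec = buildNestedPush(fetched, "home", "query2", { time: "2013-02-14" });
console.log(JSON.stringify(pushSpec));
// {"$push":{"paths.0.queries.1.requests":{"time":"2013-02-14"}}}
```

In the mongo shell the resulting object could be passed as the update document, for example `db.hosts.update({ _id: "abcd" }, pushSpec)`. Note that explicit indexes can go stale if another writer reorders or removes array elements between the read and the update, so this is better suited to low-contention data.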

There is a JIRA ticket tracking the sort of behaviour you want: https://jira.mongodb.org/browse/SERVER-831

I am unsure whether it will allow for more than one match, but I believe it will, given how the feature would need to work.
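
As far as I know, that ticket was later addressed in MongoDB 3.6 by the filtered positional operator `$[<identifier>]` together with the `arrayFilters` update option. The sketch below only builds the three arguments as plain objects so that their shape is visible; it assumes a 3.6+ server and the `hosts` collection from the question, and the identifiers `p` and `q` are arbitrary names chosen for illustration. Whether rmongodb can pass `arrayFilters` is not something this sketch covers.

```javascript
// Assumption: MongoDB 3.6 or later, where arrayFilters is available.
const filter = { _id: "abcd" };

// $[p] and $[q] are placeholders bound by the arrayFilters entries below.
const update = {
  $push: { "paths.$[p].queries.$[q].requests": { time: "2013-02-14" } }
};

// One filter per identifier: p selects elements of paths, q elements of queries.
const options = {
  arrayFilters: [{ "p.path": "home" }, { "q.name": "query1" }]
};

// In the mongo shell these would be used as:
//   db.hosts.updateOne(filter, update, options);
console.log(Object.keys(update.$push)[0]);
```

Unlike the single positional $, this form updates every array element that satisfies its filter, not just the first match.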

Answer 2:

If you can run your query from the MongoDB shell, you can work around this limitation by using the cursor's forEach function (http://docs.mongodb.org/manual/reference/method/cursor.forEach/).

Here is an example with 3 nested arrays:

// Select the documents to fix up; fill in your own query
var collectionNameCursor = db.collection_name.find({ /* ... */ });

collectionNameCursor.forEach(function(collectionDocument) {
    var firstArray = collectionDocument.firstArray;
    for (var i = 0; i < firstArray.length; i++) {
        var secondArray = firstArray[i].secondArray;
        for (var j = 0; j < secondArray.length; j++) {
            var thirdArray = secondArray[j].thirdArray;
            for (var k = 0; k < thirdArray.length; k++) {
                // ... do some logic here with thirdArray[k]
            }
        }
    }
    // Write the modified document back once, after all nested arrays are processed
    db.collection_name.save(collectionDocument);
});

Note that this is more of a one-time solution than production code, but it will do the job if you have to write a fix-up script.
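
Applied to the document from the question, the per-document logic of such a fix-up script might look like the sketch below. It works on a plain in-memory object so it can be tried without a server; `pushRequest` and `sample` are hypothetical names, and the field names are taken from the question's example.

```javascript
// Hypothetical helper: push `request` onto the requests array of every query
// named `queryName` under every path named `pathName`.
// Returns how many requests arrays were modified.
function pushRequest(doc, pathName, queryName, request) {
  let changed = 0;
  for (const p of doc.paths || []) {
    if (p.path !== pathName) continue;
    for (const q of p.queries || []) {
      if (q.name !== queryName) continue;
      // Create the array on first use, since it may not exist yet.
      if (!Array.isArray(q.requests)) q.requests = [];
      q.requests.push(request);
      changed++;
    }
  }
  return changed;
}

// In-memory stand-in for a document handed to the forEach callback.
const sample = {
  _id: "abcd",
  paths: [
    { path: "home", queries: [{ name: "query1" }, { name: "query2" }] }
  ]
};

const modified = pushRequest(sample, "home", "query1", { time: "2013-02-14" });
console.log(modified, JSON.stringify(sample.paths[0].queries[0].requests));
// 1 [{"time":"2013-02-14"}]
```

Inside the shell's forEach callback, the document would then be saved only when the helper reports a change, which avoids rewriting documents that were not touched.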

Source: https://stackoverflow.com/questions/14855246/multiple-use-of-the-positional-operator-to-update-nested-arrays

Tags

mongodb

rmongodb

nosql

Multiple use of the positional `$` operator to update nested arrays
