This question is closely related to this one and I will consider the advice given with respect to schema design in a NoSQL context, yet I'm curious to understand this:

Actual questions

Suppose you have the following document:

    _id : 2      abcd
    name : 2     unittest.com
    paths : 4    
        0 : 3    
            path : 2     home
            queries : 4      
                0 : 3    
                    name : 2     query1
                    url : 2      www.unittest.com/home?query1
                    requests: 4

                1 : 3    
                    name : 2     query2
                    url : 2      www.unittest.com/home?query2
                    requests: 4

Basically, I'd like to know

if it is possible to use MongoDB's positional $ operator (details) multiple times, or put differently, in update scenarios that involve array/document structures with a "degree of nestedness" greater than 1:

{ <update operator>: { "paths.$.queries.$.requests" : value } } (doesn't work)

instead of "only" be able to use $ once for a top-level array and being bound to use explicit indexes for arrays on "higher levels":

{ <update operator>: { "paths.$.queries.0.requests" : value } }) (works)
if possible at all, how the corresponding R syntax would look like.

Below you'll find a reproducible example. I tried to be as concise as possible.

Code example

Database connection

require("rmongodb")
db  <- "__unittest" 
ns  <- paste(db, "hosts", sep=".")
# CONNCETION OBJECT
con <- mongo.create(db=db)
# ENSURE EMPTY DB
mongo.remove(mongo=con, ns=ns)

Example document

q <- list("_id"="abcd")
b <- list("_id"="abcd", name="unittest.com")
mongo.insert(mongo=con, ns=ns, b=b)
q <- list("_id"="abcd")
b <- list("$push"=list(paths=list(path="home")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
q <- list("_id"="abcd", paths.path="home")
b <- list("$push"=list("paths.$.queries"=list(
    name="query1", url="www.unittest.com/home?query1")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)
b <- list("$push"=list("paths.$.queries"=list(
    name="query2", url="www.unittest.com/home?query2")))
mongo.update(mongo=con, ns, criteria=q, objNew=b)

Update of nested arrays with explicit position index (works)

This works, but it involves an explicit index for the second-level array queries (nested in a subdoc element of array paths):

q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")
b <- list("$push"=list("paths.$.queries.0.requests"=list(time="2013-02-13")))
> mongo.bson.from.list(b)
    $push : 3    
        paths.$.queries.0.requests : 3   
            time : 2     2013-02-13

mongo.update(mongo=con, ns, criteria=q, objNew=b)
res <- mongo.find.one(mongo=con, ns=ns, query=q)
> res
    _id : 2      abcd
    name : 2     unittest.com
    paths : 4    
        0 : 3    
            path : 2     home
            queries : 4      
                0 : 3    
                    name : 2     query1
                    requests : 4     
                        0 : 3    
                            time : 2     2013-02-13


                    url : 2      www.unittest.com/home?query1

                1 : 3    
                    name : 2     query2
                    url : 2      www.unittest.com/home?query2

Update of nested arrays with positional `$` indexes (doesn't work)

Now, I'd like to substitute the explicit 0 with the positional $ operator just like I did in order to have the server find the desired subdoc element of array paths (paths.$.queries).

AFAIU the documentation, this should work as the crucial thing is to specify a "correct" query selector:

The positional $ operator, when used with the update() method and acts as a placeholder for the first match of the update query selector:

I think I specified a query selector that does find the correct nested element (due to the paths.queries.name="query1" part):

q <- list("_id"="abcd", paths.path="home", paths.queries.name="query1")

I guess translated to "plain MongoDB" syntax, the query selector looks somewhat like this

{ _id: abcd, paths.path: home, paths.queries.name: query1 }

which seems like a valid query selector to me. In fact it does match the desired element/doc:

> !is.null(mongo.find.one(mongo=con, ns=ns, query=q))
[1] TRUE

My thought was that if it works on the top-level, why shouldn't it work for higher levels as well (as long as the query selector points to the right nested components)?

However, the server doesn't seem to like a nested or multiple use of $:

b <- list("$push"=list("paths.$.queries.$.requests"=list(time="2013-02-14")))
> mongo.bson.from.list(b)
    $push : 3    
        paths.$.queries.$.requests : 3   
            time : 2     2013-02-14

> mongo.update(mongo=con, ns, criteria=q, objNew=b)
[1] FALSE

I'm not sure if it doesn't work because MongoDB doesn't support this or if I didn't get the R syntax right.

The positional operator only supports one level deep and only the first matching element.

There is a JIRA trackable for the sort of behaviour you want here: https://jira.mongodb.org/browse/SERVER-831

I am unsure if it will allow for more than one match but I believe it will due to the dynamics of how it will need to work.

In case you can execute your query from the MongoDB shell you can bypass this limitation by taking advantage of MongoDB cursor's forEach function (http://docs.mongodb.org/manual/reference/method/cursor.forEach/)

Here is an example with 3 nested arrays:

var collectionNameCursor = db.collection_name.find({...});

collectionNameCursor.forEach(function(collectionDocument) {
    var firstArray = collectionDocument.firstArray;
    for(var i = 0; i < firstArray.length; i++) {
        var secondArray = firstArray[i].secondArray;
        for(var j = 0; j < secondArray.length; j++) {
            var thirdArray = secondArray[j].thirdArray;
            for(var k = 0; k < thirdArray.length; k++) {
                //... do some logic here with thirdArray's elements
                db.collection_name.save(collectionDocument);
            }
        }
    }
});

Note that this is more of a one time solution then a production code but it's going to do the job if you have to write a fix-up script.

来源：https://stackoverflow.com/questions/14855246/multiple-use-of-the-positional-operator-to-update-nested-arrays

标签

mongodb

rmongodb

nosql

Multiple use of the positional `$` operator to update nested arrays