Get all of a collection's document ids in RavenDB for a “per-document” modification

Question

I'm currently trying to update my documents in a RavenDB database. The issue is that I have a method that updates one document, but it takes the id of the document as a parameter. I'm using Python, so pyravendb is my client library.

The method is the following:

def updateDocument(self, id, newAttribute):
    with self.store.open_session() as session:
        doc = session.load(id)
        doc.newAttribute = newAttribute
        session.save_changes()

My idea is to use a simple for loop over all the ids of the targeted collection and call the updateDocument method for each one.

I think there is an update_by_index method, but I don't understand how to adapt it to my use case.

How can I achieve this?

Thanks!

Answer 1:

As maqduni said, update_by_index is the method you want to use. Just create an index that covers the documents you want to update. If you have trouble with that, you can simply query for the documents you want, and RavenDB will create an auto index for you. Once the index exists, call update_by_index with the index name and the query (just make sure the index is not stale).

Your code needs to look something like this:

from pyravendb.data.indexes import IndexQuery
from pyravendb.data.patches import ScriptedPatchRequest

# The patch string is JavaScript: newAttribute is evaluated inside the script,
# not taken from a Python variable of the same name.
self.store.database_commands.update_by_index(
    index_name="YOUR_INDEX_NAME",
    query=IndexQuery(query="TAG:collection_name;"),
    scripted_patch=ScriptedPatchRequest("this.Attribute = newAttribute;"))

The query in IndexQuery uses Lucene syntax; in this example, TAG is the index field that holds the collection name of each document. The scripted_patch uses JavaScript syntax; it is the script that will run on each document matched by the query.
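Since the patch is plain JavaScript text, a Python value has to be written into the script before it is sent. Below is a minimal sketch of one way to build both strings. The helper names build_collection_query and build_set_attribute_script are hypothetical (they are not part of pyravendb), and the Tag field name is taken from the other answer in this thread. Collection names containing spaces or Lucene special characters would need escaping, which this sketch does not do.

```python
import json


def build_collection_query(collection_name):
    """Return a Lucene query matching every document tagged with the collection."""
    # "Tag" is the field name used in this thread; adjust it to your index.
    return "Tag:{0}".format(collection_name)


def build_set_attribute_script(attribute, value):
    """Return a JavaScript patch that assigns a literal value to one attribute."""
    # json.dumps renders the Python value as a JavaScript literal
    # (quoted string, number, list, ...), so it is embedded safely.
    return "this[{0}] = {1};".format(json.dumps(attribute), json.dumps(value))


query_text = build_collection_query("Groups")
script_text = build_set_attribute_script("newAttribute", "some value")
print(query_text)
print(script_text)
```

The two strings would then be passed as IndexQuery(query=query_text) and ScriptedPatchRequest(script_text) in the call shown above.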

I will try to explain the difference between the two:

The get_index method gives you information about the index; the response is the IndexDefinition.

update_by_index is a long-running operation, which is why you only get back the operation_id; you need to wait until it has finished (I will add a feature for that in the next pyravendb version). This operation won't give you the documents that were patched. The new feature will give you information about the process.
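Until such a feature exists, waiting usually comes down to polling. The following is only a generic sketch: fetch_status is a hypothetical callable you would have to supply yourself (for example, one that asks the server about the operation_id), and the "Completed" key is an assumption about what that status looks like, not a documented pyravendb response.

```python
import time


def wait_for_operation(fetch_status, poll_interval=0.5, timeout=60.0):
    """Poll fetch_status() until it reports completion or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # Assumed shape: a dict with a truthy "Completed" entry once done.
        if status.get("Completed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not complete within the timeout")
        time.sleep(poll_interval)


# Stand-in for a real status lookup: reports completion on the third call.
_responses = iter([{"Completed": False}, {"Completed": False}, {"Completed": True}])
final_status = wait_for_operation(lambda: next(_responses), poll_interval=0.01)
print(final_status)
```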

The page_size applies to query results only, not to an index operation.

Answer 2:

I'm not a Python expert, but from a quick look at the source code of PyRavenDb I found store.database_commands, which is defined in database_commands.py.

The syntax is just like the syntax of the equivalent C# command:

def update_by_index(self, index_name, query, scripted_patch=None, options=None):
    """
    @param index_name: name of an index to perform a query on
    :type str
    @param query: query that will be performed
    :type IndexQuery
    @param options: various operation options e.g. AllowStale or MaxOpsPerSec
    :type BulkOperationOptions
    @param scripted_patch: JavaScript patch that will be executed on query results( Used only when update)
    :type ScriptedPatchRequest
    @return: json
    :rtype: dict
    """
    if not isinstance(query, IndexQuery):
        raise ValueError("query must be IndexQuery Type")
    path = Utils.build_path(index_name, query, options)
    if scripted_patch:
        if not isinstance(scripted_patch, ScriptedPatchRequest):
            raise ValueError("scripted_patch must be ScriptedPatchRequest Type")
        scripted_patch = scripted_patch.to_json()

    response = self._requests_handler.http_request_handler(path, "EVAL", data=scripted_patch)
    if response.status_code != 200 and response.status_code != 202:
        raise response.raise_for_status()
    return response.json()

The function accepts the name of the index, the query used to find the documents to update, and the JavaScript patch that will modify the documents' data.

If you need to update all the documents of a particular collection, consider updating them through the Raven/DocumentsByEntityName index. It's a system index that is created automatically and holds references to all documents in the entire database. So you can write a query that looks for all documents whose Tag value corresponds to the name of your collection, e.g. Query = "Tag:Groups", and pass the query to the update_by_index method.

You could also update the documents via the batch command, which is also defined in database_commands.py and documented here. NOTE: This is only applicable if you know the ids of the documents.
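If the ids are already known, the per-document loop from the question can also be kept, with one session per group of ids instead of one per document. This is not the batch command mentioned above; it is a sketch that uses only the session calls shown in the question (open_session, load, save_changes). The helper names, the chunk size of 20, and the assumption that load returns None for a missing id are all illustrative.

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def update_documents(store, ids, attribute, value, chunk_size=20):
    """Set one attribute on every document in ids, one session per chunk."""
    updated = 0
    for chunk in chunked(list(ids), chunk_size):
        with store.open_session() as session:
            for doc_id in chunk:
                doc = session.load(doc_id)
                if doc is None:
                    continue  # assumed: a missing id loads as None
                setattr(doc, attribute, value)
                updated += 1
            # One save per chunk writes all changes tracked by this session.
            session.save_changes()
    return updated
```

A call would look like update_documents(self.store, ids, "newAttribute", newAttribute), and the return value is the number of documents that were actually found and modified.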

If you are interested in C# examples, you could use a demo project that I created after visiting the RavenDB conference last year in Dallas: https://github.com/maqduni/RavenDb-Demo.

Source: https://stackoverflow.com/questions/42907922/get-all-of-a-collections-documents-ids-ravendb-for-a-per-document-modificati

Tags

python

database

ravendb

nosql