Iterating over every document in Lotus Domino

Submitted by 时光怂恿深爱的人放手 on 2019-12-12 13:13:42

Question


I'd like to iterate over every document in a (probably big) Lotus Domino database and be able to resume from the last processed document if the processing breaks (network connection error, application restart, etc.). I don't have write access to the database.

I'm looking for a way to avoid downloading documents from the server that have already been processed. So I have to pass some starting information to the server that tells it which document should be the first in the (possibly restarted) processing.

  1. I've checked the AllDocuments property and the DocumentCollection.getNthDocument method, but the collection is unsorted, so I guess the order can change between two calls.

  2. Another idea was using a formula query but it does not seem that ordering is possible with these queries.

  3. The third idea was the Database.getModifiedDocuments method with a corresponding Document.getLastModified one. It seemed good but it looks to me that the ordering of the returned collection is not documented and based on creation time instead of last modification time.

    Here is some sample code based on the official example:

    System.out.println("startDate: " + startDate);
    final DocumentCollection documentCollection = 
            database.getModifiedDocuments(startDate, Database.DBMOD_DOC_DATA);
    
    Document doc = documentCollection.getFirstDocument();
    while (doc != null) {
        System.out.println("#lastmod: " + doc.getLastModified() + 
                    " #created: " + doc.getCreated());
        doc = documentCollection.getNextDocument(doc);
    }
    

    It prints the following:

    startDate: 2012.07.03 08:51:11 CEDT
    #lastmod: 2012.07.03 08:51:11 CEDT #created: 2012.02.23 10:35:31 CET
    #lastmod: 2012.08.03 12:20:33 CEDT #created: 2012.06.01 16:26:35 CEDT
    #lastmod: 2012.07.03 09:20:53 CEDT #created: 2012.07.03 09:20:03 CEDT
    #lastmod: 2012.07.21 23:17:35 CEDT #created: 2012.07.03 09:24:44 CEDT
    #lastmod: 2012.07.03 10:10:53 CEDT #created: 2012.07.03 10:10:41 CEDT
    #lastmod: 2012.07.23 16:26:22 CEDT #created: 2012.07.23 16:26:22 CEDT
    

    (I don't use any AgentContext here to access the database. The database object comes from a session.getDatabase(null, databaseName) call.)

Is there any way to reliably do this with the Lotus Domino Java API?


Answer 1:


If you have access to change the database, or could ask someone to do so, then you should create a view that is sorted on a unique key, or modified date, and then just store the "pointer" to the last document processed.

Barring that, you'll have to maintain a list of previously processed documents yourself. In that case you can use the AllDocuments property and just iterate through them. Use the GetFirstDocument and GetNextDocument as they are reportedly faster than GetNthDocument.

Alternatively you could make two passes, one to gather a list of UNIDs for all documents, which you'll store, and then make a second pass to process each document from the list of UNIDs you have (using GetDocumentByUNID method).
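The second pass of the two-pass idea can be sketched in plain Java as follows. This is only an illustration under assumptions: the class `UnidCheckpoint` and its `processFrom` method are hypothetical names, the UNID list is assumed to have been collected and stored locally in the first pass, and the `handler` stands in for whatever fetches and processes a document (for example via `Database.getDocumentByUNID`). The returned index is what you would persist as the restart point.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: walks a stored UNID list and reports where it stopped.
final class UnidCheckpoint {
    // Processes unids starting at startIndex and returns the index of the
    // first entry that was NOT processed successfully (unids.size() if all
    // entries were handled). Persist the returned value to resume later.
    static int processFrom(List<String> unids, int startIndex, Consumer<String> handler) {
        int next = startIndex;
        try {
            while (next < unids.size()) {
                handler.accept(unids.get(next)); // assumed to fetch and process one document
                next++;                          // advance only after success
            }
        } catch (RuntimeException e) {
            // Sketch only: a real implementation would log or rethrow here.
            // 'next' still points at the entry that failed.
        }
        return next;
    }
}
```

Because the list is stored on the client, its order stays fixed between runs, which sidesteps the unsorted AllDocuments collection. Documents created after the first pass are not in the list and would need a fresh first pass.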




Answer 2:


I don't use the Java API, but in LotusScript, I would do something like this:

Locate a view displaying all documents in the database. If you want the agent to be really fast, create a new view. The first column should be sorted and could contain the Universal ID of the document. The other columns contain all the values you want to read in your agent; in your example that would be the created date and last modified date.

Your code could then simply loop through the view like this:

lastSuccessful = FunctionToReadValuesSomewhere() ' Returns 0 if empty
Set view = thisdb.GetView("MyLookupView")
Set col = view.AllEntries
Set entry = col.GetFirstEntry
cnt = 0
Do Until entry is Nothing
    cnt = cnt + 1
    If cnt > lastSuccessful Then
        universalID = entry.ColumnValues(0)
        createDate = entry.ColumnValues(1)
        lastmodifiedDate = entry.ColumnValues(2)
        Call YourFunctionToDoStuff(universalID, createDate, lastmodifiedDate)
        Call FunctionToStoreValuesSomeWhere(cnt, universalID)
    End If
    Set entry = col.GetNextEntry(entry) ' advance; GetFirstEntry here would loop forever
Loop
Call FunctionToClearValuesSomeWhere()

Simply store the last successful count and Universal ID in, say, a text file, an environment variable, or even a profile document in the database. When you restart the agent, have some code that checks whether the stored values are blank (then return 0), and otherwise returns the last successful value.




Answer 3:


Lotus Notes/Domino databases are designed to be distributed across clients and servers in a replicated environment. In the general case, you do not have a guarantee that starting at a given creation or mod time will bring you consistent results.

If you are 100% certain that no replicas of your target database are ever made, then you can use getModifiedDocuments and then write a sort routine to place (modDateTime,UNID) pairs into a SortedSet or other suitable data structure. Then you can process through the Set, and if you run into an error you can save the modDateTime of the element that you were attempting to process as your restart point. There may be a few additional details for you to work out to avoid duplicates, however, if there are multiple documents with the exact same modDateTime stamp.
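A minimal sketch of that sorting idea in plain Java, assuming the (modDateTime, UNID) pairs have already been read from the collection returned by getModifiedDocuments. The classes `ModKey` and `ResumePlan` are hypothetical names used for illustration, and the modification time is assumed to be converted to epoch milliseconds. Using the UNID as a tie-breaker is one possible way to handle documents that share the exact same timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical value type: one (modification time, UNID) pair per document.
final class ModKey implements Comparable<ModKey> {
    final long modMillis; // last-modified time as epoch milliseconds (assumption)
    final String unid;    // universal ID, used as tie-breaker for equal timestamps

    ModKey(long modMillis, String unid) {
        this.modMillis = modMillis;
        this.unid = unid;
    }

    @Override
    public int compareTo(ModKey other) {
        int byTime = Long.compare(modMillis, other.modMillis);
        return byTime != 0 ? byTime : unid.compareTo(other.unid);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ModKey && compareTo((ModKey) o) == 0;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(modMillis) * 31 + unid.hashCode();
    }
}

final class ResumePlan {
    // Returns the UNIDs still to be processed, in (modMillis, unid) order.
    // A null checkpoint means "start from the beginning". The checkpoint is
    // the key that was being processed when the failure happened, so it is
    // included in the result (tailSet is inclusive).
    static List<String> remaining(SortedSet<ModKey> all, ModKey checkpoint) {
        SortedSet<ModKey> tail = (checkpoint == null) ? all : all.tailSet(checkpoint);
        List<String> unids = new ArrayList<>();
        for (ModKey key : tail) {
            unids.add(key.unid);
        }
        return unids;
    }
}
```

Storing the whole pair as the checkpoint, rather than the timestamp alone, means a restart neither skips nor repeats documents that were modified in the same instant.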

I want to make one final remark. I understand that you are asking about Java, but if you are working on a backup or archiving system for compliance purposes, the Lotus C API has special functions that you really should look at.




Answer 4:


Agents already keep a field to describe documents that they have not yet processed, and these are automatically updated via normal processing.

A better way of doing what you're attempting to do might be to store the results of a search in a profile document. However, if you're trying to relate to documents in a database you do not have write permission to, the only thing you can do is keep a list of the doclinks you've already processed (and any information you need to keep about those documents), or a sister database holding one document for each doclink plus multiple fields related to the processing you've done on them. Then, transfer the lists of IDs and perform the matching on the client to do per-document lookups.
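Keeping the list of already processed documents on the client could look roughly like the sketch below. It is an assumption-laden illustration, not part of the Domino API: `ProcessedLog` is a hypothetical class that appends one ID per line to a local text file and reloads that file on startup, so a restarted run can skip IDs it has already handled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical client-side record of processed document IDs.
final class ProcessedLog {
    private final Path file;
    private final Set<String> done = new HashSet<>();

    // Loads previously recorded IDs if the file already exists.
    ProcessedLog(Path file) {
        this.file = file;
        try {
            if (Files.exists(file)) {
                done.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isDone(String id) {
        return done.contains(id);
    }

    // Records an ID in memory and appends it to the file (one ID per line).
    void markDone(String id) {
        if (!done.add(id)) {
            return; // already recorded
        }
        try {
            Files.write(file, Collections.singletonList(id), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A processing loop would call `isDone` before fetching a document and `markDone` after handling it. This still requires enumerating the IDs on every run, but it needs no write access to the source database.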



Source: https://stackoverflow.com/questions/13020620/iterating-over-every-document-in-lotus-domino
