How do I load a random document from CouchDB (efficiently and fairly)?

佛祖请我去吃肉 2021-02-03 10:46

I would like to load a random document out of a set of documents stored in a CouchDB database. The method for picking and loading the document should conform to the following re

5 answers
  •  不要未来只要你来
    2021-02-03 11:17

    Approach 2b: Sequential Number in Document

    This approach is similar to Approach 2 mentioned in this answer. Approach 2 uses random numbers twice (once in the document itself and once when picking a document). Approach 2b uses random numbers only when picking and stores sequential integers in the documents. Note that it does not work if documents are deleted (see below). Here is how it works:

    Add sequential integers to your documents at creation time:

    {
        _id: "4f12782c39474fd0a498126c0400708c",
        int_id : 0,
        // actual data...
    }
    

    A second document:

    {
        _id: "a498126c0400708c4f12782c39474fd0",
        int_id : 1,
        // actual data...
    }
    

    Keep counting up by one with each new document.

    The view random has the same map function (although you might want to change its name to something other than "random"):

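     function(doc) {
       // Check the type rather than truthiness: int_id 0 is falsy,
       // so a plain `if (doc.int_id)` would skip the first document.
       if (typeof doc.int_id === "number") {
         emit(doc.int_id, doc);
       }
     }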
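
As a rough sketch, the view could be stored in a design document like the one below. The design document name `d` and the view name `random` are taken from the URLs used in this answer; the variable names `mapSource` and `designDoc` are only illustrative. The map source uses a `typeof` check so that the document with `int_id` 0 is indexed as well.

```javascript
// Source of the map function, stored as a string because CouchDB
// keeps view code as text inside the design document.
const mapSource = `function (doc) {
  if (typeof doc.int_id === "number") {
    emit(doc.int_id, doc);
  }
}`;

// Design document "d" with a single view "random" (names assumed
// from the URLs in this answer).
const designDoc = {
  _id: "_design/d",
  views: {
    random: { map: mapSource },
  },
};

console.log(JSON.stringify(designDoc, null, 2));
```

Saving this object to the database (for example with a PUT to `/db/_design/d`) should make the view available under `/db/_design/d/_view/random`.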
    

    These are the steps for loading a random document:

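    • Find the total number of documents N in the view by reading total_rows from the response of (limit=0 skips the rows themselves; total_rows is still reported):
      http://localhost:5984/db/_design/d/_view/random?limit=0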
    • Pick random number 0 <= r < 1
    • Calculate random index: i = floor(r*N)
    • Load the document:
      http://localhost:5984/db/_design/d/_view/random?startkey=i&limit=1
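
The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names (`randomIndex`, `lookupUrl`, `loadRandomDocument`) are made up for illustration, the base URL is the one from this answer, and the network part assumes a runtime with a global `fetch` (for example Node 18+).

```javascript
// Base URL of the view, as used in this answer.
const VIEW_URL = "http://localhost:5984/db/_design/d/_view/random";

// Map a random number r in [0, 1) to an index in 0..N-1.
function randomIndex(totalRows, r = Math.random()) {
  if (!Number.isInteger(totalRows) || totalRows <= 0) {
    throw new RangeError("the view has no rows");
  }
  return Math.floor(r * totalRows);
}

// Build the URL that loads the row for one index.
function lookupUrl(index) {
  return `${VIEW_URL}?startkey=${index}&limit=1`;
}

// Two round trips: one for N, one for the document itself.
async function loadRandomDocument() {
  // Assumption: limit=0 returns no rows but still reports total_rows.
  const meta = await (await fetch(`${VIEW_URL}?limit=0`)).json();
  const i = randomIndex(meta.total_rows);
  const result = await (await fetch(lookupUrl(i))).json();
  // The map function emits the whole document as the row value.
  return result.rows.length > 0 ? result.rows[0].value : null;
}
```

Calling `loadRandomDocument()` returns a promise for the picked document, or `null` if the lookup comes back empty.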

    This way the int_id values are evenly distributed from 0 to N-1 by design. We then pick a random index between 0 and N-1 and look it up in that even distribution.

    Note

    This approach no longer works once documents in the middle or at the beginning are deleted. The int_id values have to start at 0 and run up to N-1 without gaps.
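
To illustrate why gaps are a problem, here is a small local simulation (no CouchDB involved). It assumes that `startkey=i&limit=1` returns the first row whose key is greater than or equal to i, and that N is the current number of rows in the view. The helper names `lookup` and `hitsPerId` are hypothetical.

```javascript
// Simulate `startkey=i&limit=1` on a view sorted by int_id:
// the first id that is >= i is returned, or null if there is none.
function lookup(sortedIds, i) {
  const hit = sortedIds.find((id) => id >= i);
  return hit === undefined ? null : hit;
}

// For every possible index 0..N-1, count which int_id comes back.
function hitsPerId(sortedIds) {
  const counts = new Map(sortedIds.map((id) => [id, 0]));
  for (let i = 0; i < sortedIds.length; i++) {
    const id = lookup(sortedIds, i);
    if (id !== null) counts.set(id, counts.get(id) + 1);
  }
  return Object.fromEntries(counts);
}

// No gaps: every document is reachable by exactly one index.
console.log(hitsPerId([0, 1, 2, 3]));

// The document with int_id 1 was deleted: id 2 is now hit by two
// indexes, and id 3 is never reached because N dropped to 3.
console.log(hitsPerId([0, 2, 3]));
```

In the second case the selection is no longer fair, which is the failure mode described in the note above.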
