CouchDB - Filtered Replication - Can the speed be improved?

I have a single database (300MB & 42,924 documents) consisting of about 20 different kinds of documents from about 200 users. The documents range in size from a few byte

1 Answer

梦谈多话

2020-12-16 21:23
Filtered replication is slow because, for every fetched document, CouchDB has to run a fairly involved sequence just to decide whether to replicate it:
1. CouchDB fetches the next document;
2. Because a filter function has to be applied, the document is converted to JSON;
3. The JSON-encoded document is passed over stdio to the query server;
4. The query server receives the document and decodes it from JSON;
5. The query server looks up and runs your filter function, which returns true or false to CouchDB;
6. If the result is true, the document is replicated;
7. Go back to step 1 and repeat for every document.
For a non-filtered replication, take this list, drop steps 2-5, and treat step 6 as always true. That per-document overhead is what slows down the whole replication process.
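
To make the per-document cost concrete, here is a minimal Python sketch that imitates the loop above. It is not CouchDB code: the `type` field, the `invoice` value and the function names are made up for illustration, and the stdio transfer of step 3 is left out, so it only shows the JSON encode/decode and filter call that a filtered pass adds for each document.

```python
import json
import time


def filter_fn(doc):
    # Hypothetical filter: keep only documents of one kind.
    return doc.get("type") == "invoice"


def filtered_pass(docs):
    """Imitate steps 1-7: encode, decode, then filter every document."""
    selected = []
    for doc in docs:                      # step 1: fetch next document
        payload = json.dumps(doc)         # step 2: convert to JSON
        received = json.loads(payload)    # step 4: query server decodes it
        if filter_fn(received):           # step 5: run the filter function
            selected.append(doc)          # step 6: replicate on true
    return selected


def unfiltered_pass(docs):
    """Imitate a non-filtered replication: every document is taken as-is."""
    return list(docs)


# Synthetic data set, sized like the database in the question.
docs = [
    {"_id": str(i), "type": "invoice" if i % 20 == 0 else "note", "body": "x" * 200}
    for i in range(42924)
]

if __name__ == "__main__":
    for name, run in (("filtered", filtered_pass), ("unfiltered", unfiltered_pass)):
        start = time.perf_counter()
        kept = run(docs)
        elapsed = time.perf_counter() - start
        print(f"{name}: kept {len(kept)} docs in {elapsed:.3f}s")
```

The absolute timings mean nothing for a real CouchDB instance; the point is only that the filtered pass does extra work for every document, including the ones it ends up rejecting.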

To significantly improve filtered replication speed, you can write your filters in Erlang and run them on the native Erlang query server. They run inside CouchDB, don't pass through any stdio interface, and incur no JSON encode/decode overhead.

NOTE: unlike the JavaScript query server, the Erlang query server does not run inside a sandbox, so you need to really trust the code that you run with it.
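
As a rough illustration of what an Erlang filter could look like, the Python sketch below builds a design document whose filter is Erlang source, plus a replication request that refers to it. Treat all names as assumptions: the design document `_design/app`, the filter name `by_type`, the `type` field and the database names are invented, and the Erlang body is only an untested sketch of a filter taking the document and request as property lists. The native Erlang query server also has to be enabled in the CouchDB configuration first, and how to do that depends on your CouchDB version.

```python
import json

# Erlang source of the filter, stored as a string in the design document.
# Assumption: documents carry a "type" field and only "invoice" is replicated.
ERLANG_FILTER = """
fun({Doc}, {_Req}) ->
    case proplists:get_value(<<"type">>, Doc) of
        <<"invoice">> -> true;
        _ -> false
    end
end.
""".strip()


def build_design_doc():
    """Design document declaring Erlang as its language (names are hypothetical)."""
    return {
        "_id": "_design/app",
        "language": "erlang",
        "filters": {"by_type": ERLANG_FILTER},
    }


def build_replication_request(source, target):
    """Replication request body that references the filter as 'ddoc/filtername'."""
    return {"source": source, "target": target, "filter": "app/by_type"}


if __name__ == "__main__":
    print(json.dumps(build_design_doc(), indent=2))
    print(json.dumps(build_replication_request("source_db", "target_db"), indent=2))
```

The design document would be saved into the source database, and the replication request body sent to CouchDB's replication endpoint.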

Another option is to optimize your filter function, e.g. by reducing object creation and method calls, but in practice you won't gain much from this.