In CouchDB, are there ways to improve performance of the View index process?

Question

I have some basic views and some map/reduce views with logic. Nothing too complex. Not too many documents. I've tried with 250k, 75k, and 10k documents. Seems like I'm always waiting for view indexing.

Does better, more efficient code in the view help? I'm assuming it's basically processing the view at all levels of aggregation. So there must be some improvement there.

Does emitting less data help? For example, emit(doc.id, doc) versus emitting only the few fields I need?

Do more or less complex keys impact view indexing?

Or is it all about memory, CPU cores, and processor speed?

There must be some documentation out there, but I can't find anything referencing ways to improve performance.

Answer 1:

I would take a closer look at the reduce function. Try to use the built-in reduce functions that are implemented in Erlang, such as _sum and _count, instead of writing JavaScript.

Complex views can take hours or more to build; that's normal.

Maybe post one of those not-too-complex map/reduce functions.
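As a rough illustration of the advice above, here is a sketch of a design document that uses a built-in reduce next to an equivalent hand-written one. The design document name "stats", the view names, and the doc.type field are all hypothetical; only _count and the sum() helper come from CouchDB itself.

```javascript
// Hypothetical design document; "stats", "by_type", "by_type_js" and the
// doc.type field are made-up names used only for illustration.
const mapByType = "function (doc) { if (doc.type) { emit(doc.type, 1); } }";

const designDoc = {
  _id: "_design/stats",
  views: {
    // Built-in reduce: evaluated inside the Erlang server, so values do not
    // have to be shipped to the JavaScript query server for reducing.
    by_type: {
      map: mapByType,
      reduce: "_count"
    },
    // Hand-written equivalent, shown only for comparison. On rereduce the
    // values are partial counts, so they must be summed, not counted.
    by_type_js: {
      map: mapByType,
      reduce: "function (keys, values, rereduce) { return rereduce ? sum(values) : values.length; }"
    }
  }
};

// The JSON body you would save as the design document.
console.log(JSON.stringify(designDoc, null, 2));
```

Both views should return the same counts when queried with group=true; the first one simply avoids running reduce logic in JavaScript.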

And don't forget: indexing all docs is only done once after changing the view (or pushing a whole bunch of new docs). Subsequent new docs are indexed incrementally.

Query the view with stale=ok to retrieve the "old" data instantly, so you don't have to wait. (But pay attention: you have to call the view without stale=ok at least once to trigger the indexing process.) Or better: use stale=update_after, which returns the old data right away and then triggers an index update.

Answer 2:
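To make the query parameters concrete, here is a small sketch that only builds the view URL; it performs no network request. The host, database name, design document, and view name are placeholders, and viewUrl is a helper invented for this example.

```javascript
// Build the URL for a CouchDB view query. All names passed in below
// (host, "mydb", "stats", "by_type") are placeholders for illustration.
function viewUrl(base, db, ddoc, view, params = {}) {
  const path = [db, "_design", ddoc, "_view", view]
    .map(encodeURIComponent)
    .join("/");
  const url = new URL(`${base}/${path}`);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

// Returns whatever is already indexed, then updates the index afterwards.
const staleUrl = viewUrl("http://localhost:5984", "mydb", "stats", "by_type", {
  stale: "update_after",
  group: true
});

// Same view without stale: the request waits until the index is current.
const freshUrl = viewUrl("http://localhost:5984", "mydb", "stats", "by_type", {
  group: true
});

console.log(staleUrl);
console.log(freshUrl);
```

Newer CouchDB releases document the update and stable query parameters as the replacement for stale, so it is worth checking the view API documentation for the version you run.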

The code you write in views is more like CREATE INDEX than SELECT. It should be irrelevant how long it takes, as long as the view builds keep up with the document change rate. Building a view is a sunk (one-time) cost.

When you query the view, that is always a B-tree lookup, which operates against a static data set in logarithmic time. That is usually the performance people care about more (in production).
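The point about logarithmic lookups can be pictured with an in-memory analogy: a sorted list of rows, a binary search for the start key, then a short scan to the end key. This is only a sketch; CouchDB stores view rows in an on-disk B-tree and uses its own key collation, and the lowerBound and rangeQuery helpers here are invented for the example.

```javascript
// Find the first position whose key is >= the requested key.
// rows must already be sorted by key, as the rows of a view index are.
function lowerBound(rows, key) {
  let lo = 0;
  let hi = rows.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (rows[mid].key < key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Rough analogue of a startkey/endkey query: logarithmic seek, then a scan
// that touches only the matching rows.
function rangeQuery(rows, startkey, endkey) {
  const out = [];
  for (let i = lowerBound(rows, startkey); i < rows.length && rows[i].key <= endkey; i++) {
    out.push(rows[i]);
  }
  return out;
}

// Plain string keys only; real view keys can be any JSON value.
const index = [
  { key: "apple", value: 1 },
  { key: "banana", value: 2 },
  { key: "cherry", value: 3 },
  { key: "date", value: 4 }
];

const hits = rangeQuery(index, "b", "d");
console.log(hits.map((row) => row.key));
```

The work of computing the keys was paid for when the index was built; the query itself never runs the map function.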

If you are not seeing behavior like I describe, perhaps we could discuss your view functions and your general approach to your problem. CouchDB is very different from relational databases. In the latter, you have highly structured data and free-form queries. In CouchDB, you have free-form data but highly structured index definitions (views). Except during development, changing and rebuilding views should be rare.

Answer 3:

Emitting as little as possible helps. Beyond that, building the view in smaller batches (there are scripts that do this automatically) helps more than anything else, since sometimes you cannot avoid emitting data.

Answer 4:

If disk speed is your bottleneck, you can try running CouchDB directly on solid-state disks, which have much lower latency and higher throughput than spinning disks.

Couchappy is a free high-performance CouchDB hosting service that lets you try the latest releases of Apache CouchDB on SSDs.
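To get a feel for how much the emitted value matters, the sketch below runs three map functions over sample documents and compares the size of the rows they produce. It runs outside CouchDB: runMap and the emit callback passed as a second argument are stand-ins invented for this example (inside CouchDB, emit is a global and map functions take only doc), and the sample documents are made up.

```javascript
// Stand-in for the view server: collects emitted rows so they can be measured.
function runMap(mapFn, docs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) {
    mapFn(doc, emit);
  }
  return rows;
}

// Made-up sample documents with one large field.
const docs = [
  { _id: "a1", type: "order", total: 12.5, notes: "x".repeat(500) },
  { _id: "a2", type: "order", total: 7, notes: "y".repeat(500) }
];

// Emits the whole document: every field is copied into the index.
const emitWholeDoc = (doc, emit) => emit(doc._id, doc);
// Emits only the field the query actually needs.
const emitOneField = (doc, emit) => emit(doc._id, doc.total);
// Emits no value at all; documents can be fetched with include_docs=true.
const emitNoValue = (doc, emit) => emit(doc._id, null);

// Serialized size is a crude proxy for how much the index has to store.
const size = (rows) => JSON.stringify(rows).length;

const wholeSize = size(runMap(emitWholeDoc, docs));
const fieldSize = size(runMap(emitOneField, docs));
const nullSize = size(runMap(emitNoValue, docs));

console.log({ wholeSize, fieldSize, nullSize });
```

The trade-off with emitting null and using include_docs=true is that each document is then looked up at query time instead of being read straight from the index.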

Source: https://stackoverflow.com/questions/9236217/in-couchdb-are-there-ways-to-improve-performance-of-the-view-index-process

Tags

couchdb