MongoDB: does document size affect query performance?


不思量自难忘° 2021-01-31 16:31

Assume a mobile game that is backed by a MongoDB database containing a User collection with several million documents.

Now assume several dozen properties t

4 answers

暗喜 (original poster)

2021-01-31 17:10
Short answer: yes.

Long answer: how it affects queries depends on many factors, such as the nature of the queries, the available memory, and the index sizes.

The best you can do is test.

The code below generates two collections named smallDocuments and bigDocuments, with 1024 documents each, differing only in a field 'c' that contains a big string, and in the _id. The bigDocuments collection will be about 2 GB, so be careful when running it.
```
const numberOfDocuments = 1024;

// 2MB string x 1024 ~ 2GB collection
const bigString = 'a'.repeat(2 * 1024 * 1024);

// generate and insert documents in two collections: smallDocuments and
// bigDocuments;
for (let i = 0; i < numberOfDocuments; i++) {
  let doc = {};
  // field a: integer from 0 to 9, equal in both collections;
  doc.a = ~~(Math.random() * 10);

  // field b: single character from 'a' to 'j', equal in both collections;
  doc.b = String.fromCharCode(97 + ~~(Math.random() * 10));

  //insert in smallDocuments collection
  db.smallDocuments.insert(doc);

  // field c: big string, present only in bigDocuments collection;
  doc.c = bigString;

  //insert in bigDocuments collection
  db.bigDocuments.insert(doc);
}
```
You can put this code in a file (e.g. create-test-data.js) and run it directly in the mongo shell with this command:

mongo testDb < create-test-data.js

It will take a while. After that you can run some test queries, such as these:
```
const numbersToQuery = [];

// generate 100 random numbers to query documents using field 'a':
for (let i = 0; i < 100; i++) {
  numbersToQuery.push(~~(Math.random() * 10));
}

const smallStart = Date.now();
numbersToQuery.forEach(number => {
  // query using inequality conditions: slower than equality
  const docs = db.smallDocuments
    .find({ a: { $ne: number } }, { a: 1, b: 1 })
    .toArray();
});
print('Small: ' + (Date.now() - smallStart) + ' ms');

const bigStart = Date.now();
numbersToQuery.forEach(number => {
  // repeat the same queries in the bigDocuments collection; note that the big field 'c'
  // is omitted in the projection
  const docs = db.bigDocuments
    .find({ a: { $ne: number } }, { a: 1, b: 1 })
    .toArray();
});
print('Big: ' + (Date.now() - bigStart) + ' ms');
```
Here I got the following results:

Without index:
```
Small: 1976 ms
Big: 19835 ms
```
After indexing field 'a' in both collections, with .createIndex({ a: 1 }):
```
Small: 2258 ms
Big: 4761 ms
```
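This shows that queries on big documents are slower, even though the big field 'c' is excluded by the projection. Without an index, the bigDocuments queries took about 10 times as long as the smallDocuments ones; with the index on 'a', they still took more than twice as long (more than 100% longer).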
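The slowdown factors follow directly from the timings reported above. A small sketch in plain JavaScript (it should run in Node.js or mongosh; the `slowdown` helper is a hypothetical name, not part of MongoDB):

```javascript
// Timings in milliseconds, copied from the results above.
const timings = {
  withoutIndex: { small: 1976, big: 19835 },
  withIndex: { small: 2258, big: 4761 },
};

// Hypothetical helper: how many times slower the big collection was,
// and by what percentage the elapsed time grew.
function slowdown({ small, big }) {
  const ratio = big / small;
  return {
    ratio: Number(ratio.toFixed(2)),
    percentSlower: Math.round((ratio - 1) * 100),
  };
}

const withoutIndex = slowdown(timings.withoutIndex);
const withIndex = slowdown(timings.withIndex);

console.log(withoutIndex); // { ratio: 10.04, percentSlower: 904 }
console.log(withIndex);    // { ratio: 2.11, percentSlower: 111 }
```

These numbers come from a single run on one machine, so they are best read as an order of magnitude rather than a benchmark.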

My suggestions are:
1. Use equality conditions in queries (https://docs.mongodb.com/manual/core/query-optimization/index.html#query-selectivity);
2. Use covered queries (https://docs.mongodb.com/manual/core/query-optimization/index.html#covered-query);
3. Use indices that fit in memory (https://docs.mongodb.com/manual/tutorial/ensure-indexes-fit-ram/);
4. Keep documents small;
5. If you need phrase queries using text indices, make sure the entire collection fits in memory (https://docs.mongodb.com/manual/core/index-text/#storage-requirements-and-performance-costs, last bullet);
6. Generate test data and run test queries that simulate your app's use case; use random string generators if needed.
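Suggestion 2 can be checked with `explain('executionStats')`: a covered query is answered from the index alone, so `totalDocsExamined` stays at 0. A minimal sketch, assuming the test collections from above, a non-sharded deployment, and a compound index on `{ a: 1, b: 1 }`; `isCovered` is a hypothetical helper, and the shell part only runs where a `db` object exists:

```javascript
// Hypothetical helper: decide from explain('executionStats') output
// whether an index was scanned without fetching any document.
function isCovered(explainResult) {
  const stats = explainResult.executionStats;
  return stats.totalKeysExamined > 0 && stats.totalDocsExamined === 0;
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
  // Assumption: a compound index holding every projected field.
  db.bigDocuments.createIndex({ a: 1, b: 1 });

  // _id is excluded because it is not part of the index; including it
  // would force the documents to be fetched.
  const result = db.bigDocuments
    .find({ a: 3 }, { _id: 0, a: 1, b: 1 })
    .hint({ a: 1, b: 1 })
    .explain('executionStats');

  print('Covered: ' + isCovered(result));
}
```

If the query is covered, the size of field 'c' should matter much less, because the big documents are never read.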
I had problems with text queries on big documents in MongoDB; see "Autocomplete and text search memory issues in apostrophe-cms: need ideas".

Here is some code I wrote to generate sample data in ApostropheCMS, along with some test results: https://github.com/souzabrs/misc/tree/master/big-pieces.
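For the plain mongo shell tests, suggestion 6 mentions random string generators. A minimal sketch, assuming an alphanumeric alphabet is good enough for your data; `randomString` and `ALPHABET` are hypothetical names and are not taken from the linked repository:

```javascript
// Assumed alphabet: ASCII letters and digits.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

// Hypothetical helper: build a string of `length` random characters.
// Unlike 'a'.repeat(n), the result is not one repeated character.
function randomString(length) {
  const chars = new Array(length);
  for (let i = 0; i < length; i++) {
    chars[i] = ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
  }
  return chars.join('');
}

// Example: a 16-character value for a test field.
const sample = randomString(16);
```

In the data generation script, `doc.c = randomString(2 * 1024 * 1024)` could replace the shared `bigString`, at the cost of a noticeably slower run.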

This is more a database design issue than a MongoDB internals one. I think MongoDB was designed to behave this way, but it would help a lot to have a more explicit explanation in its documentation.