Question
I have a MongoDB 4.2 cluster with three shards. I have a collection (users) that, before sharding, can be verified to contain 600000 documents:
mongos> db.users.count()
600000
Next, I shard it with the usual commands (first the database, then the collection):
mongos> sh.enableSharding("app")
mongos> sh.shardCollection("app.users", {"name.first": 1})
After a couple of minutes or so, I get a roughly even distribution of chunks among the shards:
chunks:
shard0000 3
shard0001 2
shard0002 3
So far so good.
However, if I run a count right after this, I get a strange value, higher than the number of documents in the collection:
mongos> db.users.count()
994243
mongos> db.users.find({}).count()
994243
Moreover, the getShardDistribution() result for the collection is also strange: one of the shards reports the full document count, which makes no sense, as part of those documents have been distributed to the other two shards:
mongos> db.users.getShardDistribution()
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 243.69MiB docs : 600000 chunks : 3
estimated data per chunk : 81.23MiB
estimated docs per chunk : 200000
Totals
data : 403.62MiB docs : 994243 chunks : 8
Shard shard0000 contains 23.74% data, 23.79% docs in cluster, avg obj size on shard : 424B
Shard shard0001 contains 15.87% data, 15.85% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 60.37% data, 60.34% docs in cluster, avg obj size on shard : 425B
Interestingly, if I wait a while (I'm not sure how long, but no more than 30 minutes), count and getShardDistribution() are back to normal:
mongos> db.users.count()
600000
mongos> db.users.getShardDistribution()
Shard shard0001 at localhost:27019
data : 64.06MiB docs : 157632 chunks : 2
estimated data per chunk : 32.03MiB
estimated docs per chunk : 78816
Shard shard0002 at localhost:27020
data : 83.77MiB docs : 205757 chunks : 3
estimated data per chunk : 27.92MiB
estimated docs per chunk : 68585
Shard shard0000 at localhost:27018
data : 95.85MiB docs : 236611 chunks : 3
estimated data per chunk : 31.95MiB
estimated docs per chunk : 78870
Totals
data : 243.69MiB docs : 600000 chunks : 8
Shard shard0001 contains 26.28% data, 26.27% docs in cluster, avg obj size on shard : 426B
Shard shard0002 contains 34.37% data, 34.29% docs in cluster, avg obj size on shard : 426B
Shard shard0000 contains 39.33% data, 39.43% docs in cluster, avg obj size on shard : 424B
Why is this happening? How can I avoid this effect? (Maybe by forcing some kind of sync with a command?)
Thanks!
PS: In case it is relevant, I'm using a testing environment in which each shard is implemented by a standalone mongod process. The config server uses a single-node replica set configuration.
Answer 1:
count() provides an estimated count and may not be accurate. Use countDocuments() to get an accurate count.
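As a minimal illustration (reusing the app.users collection and the figures from the question), the two helpers can disagree while a migration is in flight, because countDocuments() runs an aggregation that actually filters the documents it counts rather than trusting collection metadata:
mongos> db.users.count()            // metadata-based estimate; counts migrating copies too
994243
mongos> db.users.countDocuments({}) // scans and filters; returns the real count
600000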
You can read the source of getShardDistribution() by typing db.users.getShardDistribution (without parentheses) in the shell. It appears to use information stored in the config database.
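For example (output elided here, since the exact function body varies by shell version), typing the helper name without parentheses prints its JavaScript source, and you can inspect the chunk metadata it relies on by querying the config database yourself (the ns field of config.chunks is the 4.2-era layout):
mongos> db.users.getShardDistribution
function() {
    // ...the shell prints the helper's source here...
}
mongos> use config
mongos> db.chunks.find({ ns: "app.users" }, { shard: 1, min: 1, max: 1 })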
It is quite reasonable to expect that the statistics stored by the database aren't exactly accurate: keeping them up-to-date for every operation performed anywhere in the cluster would be costly.
You seem to be looking at the statistics at a point in time after some chunks have been copied from one shard to another but before those chunks have been removed from the original shard. In this situation the data is stored twice in the cluster, and the statistics are not accurate. To obtain an accurate count, use countDocuments().
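If you would rather not wait for the donor shard to delete the migrated copies on its own, one option is to drive the cleanup yourself. The following is a sketch assuming MongoDB 4.2's pre-4.4 cleanupOrphaned semantics; it must be run while connected directly to the shard's mongod (e.g. localhost:27020 in your setup), not through mongos:
// Connected directly to the shard still reporting the extra documents.
// cleanupOrphaned deletes one orphaned range per call and reports where it
// stopped, so it is looped until no ranges remain.
var nextKey = {};
while (nextKey != null) {
    var result = db.adminCommand({ cleanupOrphaned: "app.users", startingFromKey: nextKey });
    if (result.ok != 1) { printjson(result); break; }
    nextKey = result.stoppedAtKey;
}
Once the loop finishes, count() through mongos should agree with countDocuments() again.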
Source: https://stackoverflow.com/questions/61852644/wrong-count-of-documents-in-mongodb-shard-cluster