Sharding key (MongoDB) for a large number of documents

Submitted by 孤者浪人 on 2019-12-11 05:22:57

Question


I am developing a web application where users will upload a large number of documents to the system, and different types of operations will be performed on those documents, including aggregation. However, the number of documents uploaded by each user varies widely: some might upload a dozen documents, and some might upload a million.

Documents look something like this:

doc{
    _id: <self generated UUID>,
    uid: <id of user who uploaded the document>,
    ctime: <creation timestamp>,
    ....
        <other attributes, etc>
    ....
}

Now here is the problem in choosing the shard key:
1. If I choose the UUID as the shard key, documents uploaded by the same user are unlikely to end up in the same shard and aggregation operations will be costly.
2. If I use uid as the shard key then the data stored in shards will not be even.
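
To make the two options concrete, here is a rough sketch of the trade-off. It assumes a simplified hash-placement model (a stable hash of the key modulo a hypothetical NUM_SHARDS), which is not how MongoDB's balancer actually places chunks; the user names and document counts are made up for illustration.

```python
import hashlib
import uuid
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size, chosen only for this illustration


def shard_for(key: str) -> int:
    """Map a key to a shard using a stable hash (simplified placement model)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def simulate(doc_counts: dict, key_field: str):
    """Return (documents per shard, number of shards each user's documents touch)."""
    per_shard = Counter()
    shards_per_user = {}
    for uid, n in doc_counts.items():
        touched = set()
        for i in range(n):
            # Deterministic stand-in for the self-generated UUID in _id.
            doc_id = str(uuid.uuid5(uuid.NAMESPACE_OID, f"{uid}-{i}"))
            key = doc_id if key_field == "_id" else uid
            shard = shard_for(key)
            per_shard[shard] += 1
            touched.add(shard)
        shards_per_user[uid] = len(touched)
    return per_shard, shards_per_user


# Made-up workload: one heavy uploader and fifty light ones.
users = {"heavy": 20000}
users.update({f"light{i}": 12 for i in range(50)})

by_id, spread_id = simulate(users, "_id")
by_uid, spread_uid = simulate(users, "uid")

print("key=_id  docs per shard:", dict(by_id), "| shards touched by heavy user:", spread_id["heavy"])
print("key=uid  docs per shard:", dict(by_uid), "| shards touched by heavy user:", spread_uid["heavy"])
```

Under this model, keying on _id spreads the documents evenly but scatters the heavy user's data over every shard, while keying on uid keeps each user on one shard and leaves that shard holding nearly all the data.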

Can anyone suggest which is the best way to achieve this?

I am very new to partitioning and sharding, and my research on Google as well as Stack Overflow did not yield anything. I can change the schema of the documents if needed, since the project is still in the design phase.


Answer 1:


This is the best guide I've seen on choosing a shard key: http://www.kchodorow.com/blog/2011/01/04/how-to-choose-a-shard-key-the-card-game/

You have to decide how you want to query the data. Perhaps a combination of uid and ctime will yield a good shard key, but I'm not sure if that will cause you grief while querying, as you haven't given much insight on how you plan to query.
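
As a rough illustration of why a uid plus ctime combination can help, the sketch below uses a toy model of range-based chunking: keys are kept in sorted order and a chunk may only be split between two different key values. The max_docs threshold and the sample data are assumptions made for this example; MongoDB splits chunks by size in bytes, not by document count.

```python
def split_into_chunks(keys, max_docs):
    """Group sorted keys into contiguous chunks of about max_docs documents.

    A split is only allowed between two different key values, so a long run
    of identical keys ends up in one oversized chunk.
    """
    chunks = []
    current = []
    for key in sorted(keys):
        if len(current) >= max_docs and key != current[-1]:
            chunks.append(current)
            current = []
        current.append(key)
    if current:
        chunks.append(current)
    return chunks


# Made-up workload: (uid, ctime) pairs for one heavy user and twenty light users.
docs = [("heavy", t) for t in range(1000)]
docs += [(f"light{i:02d}", t) for i in range(20) for t in range(5)]

MAX_DOCS = 100  # hypothetical split threshold for this toy model

# Shard key = uid only: every document of a user has the same key value.
uid_chunks = split_into_chunks([uid for uid, _ in docs], MAX_DOCS)

# Shard key = (uid, ctime): the same user's documents have distinct, adjacent keys.
compound_chunks = split_into_chunks(docs, MAX_DOCS)

print("uid only:     chunks =", len(uid_chunks),
      "largest =", max(len(c) for c in uid_chunks))
print("(uid, ctime): chunks =", len(compound_chunks),
      "largest =", max(len(c) for c in compound_chunks))
```

With uid alone, the heavy user's documents form a single chunk that cannot be split any further. With (uid, ctime), the same documents break into several chunks that are still adjacent in key order, so they can be moved to different shards while a query on one uid only needs the chunks covering that uid's range.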




Answer 2:


You can read more on shard key selection and scaling:

1] Kristina Chodorow's book "Scaling MongoDB" http://shop.oreilly.com/product/0636920018308.do

2] Antoine Girbal's presentation on Sharding Best Practices http://www.10gen.com/presentations/MongoNYC-2012/Sharding-Best-Practices-Advanced



Source: https://stackoverflow.com/questions/11251356/sharding-key-mongodb-for-large-number-documents
