What is the best document storage strategy in NoSQL databases?

忘掉有多难 2021-01-07 10:37

NoSQL databases like Couchbase hold a lot of documents in memory, hence their enormous speed, but this also puts a greater demand on the memory size of the server(s) i

2 answers
  • 2021-01-07 11:06

    I agree with your technique for using resources efficiently (if they are limited). On the flip side, the system might end up being very chatty. If I understand correctly, your "connections" document design is too granular and may introduce too many network round trips. In my experience, network I/O is very expensive if you are designing a system that makes real-time decisions. You can estimate the impact of each design mathematically to balance these opposing forces :)
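
    As a rough illustration of such an estimate, the sketch below compares a fine-grained design (many small reads) with a coarse-grained one (a single large read). Every figure in it (round-trip time, bandwidth, document sizes, number of reads) is an assumption chosen for illustration, not a measurement, and the model assumes the reads are issued one after another.

```python
def total_latency_ms(round_trips: int, rtt_ms: float,
                     payload_bytes: int, bandwidth_bytes_per_ms: float) -> float:
    """Rough cost of sequential reads: one round trip each, plus transfer time."""
    return round_trips * rtt_ms + payload_bytes / bandwidth_bytes_per_ms


# Assumed figures, for illustration only.
RTT_MS = 0.5            # assumed round-trip time inside one data center
BANDWIDTH = 125_000.0   # bytes per millisecond, roughly 1 Gbit/s

# Fine-grained: 20 separate reads of small 200-byte documents.
fine = total_latency_ms(round_trips=20, rtt_ms=RTT_MS,
                        payload_bytes=20 * 200,
                        bandwidth_bytes_per_ms=BANDWIDTH)

# Coarse-grained: one read of a single 50 kB document.
coarse = total_latency_ms(round_trips=1, rtt_ms=RTT_MS,
                          payload_bytes=50_000,
                          bandwidth_bytes_per_ms=BANDWIDTH)

print(f"fine-grained:   {fine:.3f} ms")
print(f"coarse-grained: {coarse:.3f} ms")
```

    Under these assumed numbers the round trips dominate the cost, not the bytes transferred. The coarse document costs more memory per read, though, so the same calculation should be repeated with your own measured figures before drawing a conclusion.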

    I do think the spirit of scalable big data systems is that we should worry "less" about resource "constraints". These NoSQL database licenses do not go by CPU cores. Commodity hardware is cheap, and RAM keeps getting cheaper. Once again, the return on investment of these systems would also influence the architectural decisions.

  • 2021-01-07 11:14

    Thank you for updating your original question. You are correct to talk about finding the right balance between coarse-grained and fine-grained documents.

    The final shape of your documents depends on the needs of your particular business domain. Identify the "chunks" of data that your use cases need as a whole, and base the shape of your stored documents on them. Here are some high-level steps to follow when you design your document structure:

    1. Identify all document consumption use cases for your app/service (read, read-write, searchable items).
    2. Design your documents (most likely you will end up with several smaller documents rather than one big doc that has everything).
    3. Design document keys that let different document types coexist in one bucket (e.g. use a namespace in the key).
    4. Do a "dry run" of the resulting model against your use cases to see whether each one needs an optimal number of (read/write) transactions to NoSQL and gets all required document data within those transactions.
    5. Run performance tests for your use cases (try to simulate at least twice the expected load).
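
    Steps 3 and 4 can be sketched without a real cluster. The example below is a minimal sketch, assuming a plain in-memory dictionary as a stand-in for the bucket; `make_key`, `CountingStore`, `can_send_sms`, the MSISDN value, and the document fields are all hypothetical names invented for illustration. The key shape follows the MSISDN:profile pattern from your question.

```python
from collections import Counter
from typing import Any, Dict, Optional


def make_key(msisdn: str, doc_type: str) -> str:
    """Build a namespaced key so several document types can share one bucket."""
    return f"{msisdn}:{doc_type}"


class CountingStore:
    """In-memory stand-in for a bucket that counts reads and writes (not a real SDK)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Any] = {}
        self.ops: Counter = Counter()

    def upsert(self, key: str, doc: Any) -> None:
        self.ops["write"] += 1
        self._docs[key] = doc

    def get(self, key: str) -> Optional[Any]:
        self.ops["read"] += 1
        return self._docs.get(key)


def can_send_sms(store: CountingStore, msisdn: str) -> bool:
    """Hypothetical use case: it only needs the opt-in document."""
    optin = store.get(make_key(msisdn, "optin"))
    return bool(optin and optin.get("sms"))


store = CountingStore()
msisdn = "15550100"  # made-up subscriber number
store.upsert(make_key(msisdn, "profile"), {"plan": "prepaid"})
store.upsert(make_key(msisdn, "optin"), {"sms": True})

store.ops.clear()  # dry run: count only the operations of the use case itself
allowed = can_send_sms(store, msisdn)
reads_needed = store.ops["read"]
print(f"allowed={allowed}, reads needed={reads_needed}")
```

    If a frequent use case needs many reads in such a dry run, that is a hint the documents are split too finely for it. If it reads one large document to use a single field, they may be too coarse.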

    Note: When you design different docs, it's OK to have some redundancy (remember, it's not an RDBMS with normalized forms). Think of it more as object-oriented design.

    Note 2: If you have searchable items that are outside of your keys (e.g. searching customers by last name "starts with" plus other dynamic search criteria), consider using the Elasticsearch integration with CB, or try the N1QL query language that is coming with CB 3.0.

    It seems you are going in the right direction by splitting the data into several smaller documents all linked by an MSISDN, e.g. MSISDN:profile, MSISDN:revenue, MSISDN:optin. I would pay special attention to your last document type, the "A/B" connection. It sounds like it might generate a large volume and be transient in nature, so you have to find out how long these documents need to live in the Couchbase bucket. You can specify a TTL (time to live) so that old docs are cleaned up automatically.
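
    To make the TTL idea concrete, here is a minimal sketch of the behavior, assuming an in-memory stand-in instead of a real bucket. `TTLStore` is a hypothetical class written for this illustration and is not the Couchbase SDK; in Couchbase the expiry is applied by the server. The `connection` key shape and the one-hour lifetime are also assumptions.

```python
from typing import Any, Dict, Optional, Tuple


class TTLStore:
    """In-memory stand-in that mimics per-document expiry (not a real SDK)."""

    def __init__(self) -> None:
        # key -> (document, absolute expiry time or None for "never expires")
        self._docs: Dict[str, Tuple[Any, Optional[float]]] = {}

    def upsert(self, key: str, doc: Any, now: float,
               ttl_seconds: Optional[float] = None) -> None:
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._docs[key] = (doc, expires_at)

    def get(self, key: str, now: float) -> Optional[Any]:
        entry = self._docs.get(key)
        if entry is None:
            return None
        doc, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._docs[key]  # lazily drop the expired document
            return None
        return doc


store = TTLStore()
# Long-lived document: no TTL.
store.upsert("15550100:profile", {"plan": "prepaid"}, now=0.0)
# Transient A/B connection document: assumed to be useful for one hour only.
store.upsert("15550100:connection:15550199", {"calls": 3},
             now=0.0, ttl_seconds=3600)

still_there = store.get("15550100:connection:15550199", now=3599.0)
expired = store.get("15550100:connection:15550199", now=3600.0)
profile = store.get("15550100:profile", now=1_000_000.0)
```

    Time is passed in explicitly as `now` so the behavior is easy to check. The right lifetime for the connection documents depends on how long your use cases actually need them.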
