cost of keys in JSON document database (mongodb, elasticsearch)

Submitted by 柔情痞子 on 2019-12-10 11:59:16

Question


I would like to know if anyone has experience with how the size of JSON keys affects speed or optimization in a document store database like mongodb or elasticsearch.

So for example: I have 2 documents

doc1: { keeeeeey1: 'abc', keeeeeeey2: 'xyz' }

doc2: { k1: 'abc', k2: 'xyz' }

Let's say I have 10 million records; storing the data in doc1 format would then take more db file space than storing it in doc2 format.

Other than that, what are the disadvantages or negative effects in terms of speed, RAM, or any other optimization?


Answer 1:


You correctly noticed that the documents will differ in size. With the second schema you save at least 15 bytes per document (about 60% for similar documents). That adds up to roughly 140MB for your 10 million records. This gives you the following advantages:

  • HDD savings. Given current HDD prices, though, this saving is mostly irrelevant.
  • RAM saving. Unlike disk space, RAM is scarce, so this one can matter. mongodb performs best when its working set (the indexes and the frequently accessed documents) fits in RAM. Shorter keys make every cached document smaller, so the 140MB saved on disk is also up to 140MB of potential RAM saved (which is actually noticeable).
  • I/O. A lot of bottlenecks come from the limits of the input/output system (the speed of reading from and writing to disk is limited). For your documents, this means that with schema 2 you can potentially read/write about twice as many documents per second.
  • Network. In a lot of situations the network is even slower than disk I/O, and if your DB server is on a different machine than your application server, the data has to be sent over the wire. With shorter keys you can send roughly twice as many documents in the same amount of traffic.
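
The size estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it counts the bytes of the key names alone, assumes ASCII keys (one byte per character), and ignores per-document overhead and any compression the storage engine may apply. The variable and function names are illustrative.

```python
# Rough estimate of the bytes spent on key names alone.
# Assumption: ASCII keys, so one byte per character; no compression.
long_keys = ["keeeeeey1", "keeeeeeey2"]   # keys from doc1
short_keys = ["k1", "k2"]                 # keys from doc2
num_docs = 10_000_000


def key_bytes(keys):
    """Total bytes needed to store the key names of one document."""
    return sum(len(k.encode("utf-8")) for k in keys)


saved_per_doc = key_bytes(long_keys) - key_bytes(short_keys)
saved_total = saved_per_doc * num_docs

print(saved_per_doc)                 # 15
print(round(saved_total / 2**20))    # 143 (MiB), i.e. roughly 140MB
```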

Having covered the advantages, I also have to mention the disadvantages of short keys:

  • Readability of the database. When you run db.coll.findOne() and see {_id: 1, t: 13423, a: 3, b: 0.2}, it is pretty hard to understand what exactly is stored there.
  • Readability of the application. This is similar to the database problem, but at least here there is a solution. With mapping logic that transforms currentDate to c and price to p, you can write clean code and still have a short schema.
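
The mapping logic mentioned in the last point could look something like the following minimal sketch. The names FIELD_MAP, to_storage and from_storage are hypothetical, not part of any mongodb or elasticsearch API; the idea is just to translate keys at the boundary between the application and the database.

```python
# Hypothetical mapping between the readable names used in application code
# and the short keys actually stored in the database.
FIELD_MAP = {"currentDate": "c", "price": "p"}
REVERSE_MAP = {short: long for long, short in FIELD_MAP.items()}


def to_storage(doc):
    """Rename readable keys to their short stored form; unknown keys pass through."""
    return {FIELD_MAP.get(key, key): value for key, value in doc.items()}


def from_storage(doc):
    """Rename short stored keys back to readable names; unknown keys pass through."""
    return {REVERSE_MAP.get(key, key): value for key, value in doc.items()}


record = {"_id": 1, "currentDate": 13423, "price": 0.2}
stored = to_storage(record)
print(stored)                         # {'_id': 1, 'c': 13423, 'p': 0.2}
print(from_storage(stored) == record) # True
```

The application only ever sees the long names, while the documents written to the store keep the short ones. Many object-document mappers offer a similar field-aliasing feature, so check whether your driver or ODM already provides one before writing your own.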


Source: https://stackoverflow.com/questions/28492785/cost-of-keys-in-json-document-database-mongodb-elasticsearch
