Store MongoDB data in compressed format

Submitted by 谁都会走 on 2019-12-18 22:35:12

Question


I am using MongoDB to store the raw HTML of web pages scraped with the Scrapy framework. One day of web scraping fills up 25 GB of disk space. Is there a way to store the raw data in a compressed format?


Answer 1:


There's nothing built in for compression. Some operating systems offer disk/file compression, but if you want more control, I'd suggest you compress it using a library for whatever programming language you're using and manually control the compression.

For example, Node.js offers simple convenience methods for this: http://nodejs.org/api/zlib.html#zlib_examples
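
Since the question uses Scrapy, the same idea in Python might look like the minimal sketch below. It relies only on the standard-library `zlib` module. The helper names `make_page_document` and `read_page_html` and the field names `url`, `encoding`, `html_z`, and `raw_size` are hypothetical, chosen for illustration. It also assumes a Python 3 driver such as PyMongo, which stores `bytes` values as BSON binary data.

```python
import zlib


def make_page_document(url, html, level=6):
    """Build a dict that holds the page HTML as zlib-compressed bytes."""
    raw = html.encode("utf-8")
    return {
        "url": url,
        "encoding": "utf-8",                   # needed to decode after decompressing
        "html_z": zlib.compress(raw, level),   # compressed payload
        "raw_size": len(raw),                  # uncompressed size, for bookkeeping
    }


def read_page_html(doc):
    """Recover the original HTML string from a stored document."""
    return zlib.decompress(doc["html_z"]).decode(doc["encoding"])


if __name__ == "__main__":
    sample = "<html><body>" + "<p>hello world</p>" * 1000 + "</body></html>"
    doc = make_page_document("http://example.com/", sample)
    print(doc["raw_size"], "->", len(doc["html_z"]), "bytes")
    assert read_page_html(doc) == sample
    # With a PyMongo collection (assumption), the dict could then be stored:
    # collection.insert_one(doc)
```

The trade-off is that the compressed field can no longer be queried or indexed by its content, so anything you need to search on (such as the URL) should stay uncompressed.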

3.0 Update

If you switch to the new WiredTiger storage engine, which ships with 3.0, you can choose between several types of compression, as described in the MongoDB documentation. Of course, you'll want to test this change against production workloads to find out whether the additional CPU utilization is worth the compression gained.




Answer 2:


Starting with version 2.8 of MongoDB (the release that shipped as 3.0), you can use compression. The WiredTiger engine offers three compression options; mmap, which is the default engine in 2.6, does not provide compression:

  • None
  • snappy (by default)
  • zlib
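
As a sketch of how one of these compressors could be selected for a single collection from Python: the `create` command accepts a `storageEngine` option carrying a WiredTiger `configString`. The helper name `wiredtiger_collection_options`, the database name `scraping`, and the collection name `pages` are hypothetical, and the PyMongo call is left commented out because it needs a running server that uses WiredTiger.

```python
# Compressor names taken from the list above.
ALLOWED_COMPRESSORS = ("none", "snappy", "zlib")


def wiredtiger_collection_options(compressor="snappy"):
    """Return keyword options for creating a collection with a given block compressor."""
    if compressor not in ALLOWED_COMPRESSORS:
        raise ValueError("unsupported compressor: %r" % (compressor,))
    return {
        "storageEngine": {
            "wiredTiger": {"configString": "block_compressor=" + compressor}
        }
    }


if __name__ == "__main__":
    print(wiredtiger_collection_options("zlib"))
    # Assumed usage with PyMongo against a WiredTiger-backed server:
    # from pymongo import MongoClient
    # db = MongoClient()["scraping"]
    # db.create_collection("pages", **wiredtiger_collection_options("zlib"))
```

The server-wide default can instead be set through the `storage.wiredTiger.collectionConfig.blockCompressor` setting in the mongod configuration.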

[Figure: how much disk space each option saves for 16 GB of data. The data is taken from an article linked in the original answer.]




Answer 3:


In Python 2, you can compress your string before storing it like this: myhtml.encode('zlib')
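
On Python 3, where strings no longer accept this codec, a rough equivalent goes through the `codecs` module and its bytes-to-bytes `zlib_codec`. This is only a sketch: the variable name `myhtml` follows the answer above, and the sample content is made up.

```python
import codecs
import zlib

# Made-up sample page; repetitive markup compresses well.
myhtml = "<html><body>" + "<div>item</div>" * 200 + "</body></html>"

# Encode the text to bytes first, then compress the bytes.
compressed = codecs.encode(myhtml.encode("utf-8"), "zlib_codec")

# Reverse the two steps to get the original string back.
restored = codecs.decode(compressed, "zlib_codec").decode("utf-8")
assert restored == myhtml

# The codec output is an ordinary zlib stream, so zlib.decompress reads it too.
assert zlib.decompress(compressed).decode("utf-8") == myhtml

print(len(myhtml.encode("utf-8")), "->", len(compressed), "bytes")
```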



Source: https://stackoverflow.com/questions/18014541/store-mongodb-data-in-compressed-format
