I am using MongoDB to store the raw HTML of web pages scraped with the Scrapy framework. One day of scraping fills up 25 GB of disk space. Is there a way to store raw data in c
There's nothing built in for compression. Some operating systems offer disk or file compression, but if you want more control, I'd suggest compressing the data yourself, with a library for whatever programming language you're using, before you write it to the database.
For example, Node.js offers simple convenience methods for this: http://nodejs.org/api/zlib.html#zlib_examples
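Since Scrapy is Python, the same idea could look roughly like the sketch below, which uses only the standard-library zlib module. The helper names and the document fields (url, html_z, encoding) are made up for illustration, and the actual insert into MongoDB is left out. As far as I know, PyMongo on Python 3 stores a bytes value as BSON binary, so the resulting dict should be insertable as-is.

```python
import zlib


def compress_html(html: str, level: int = 6) -> bytes:
    """Encode the page as UTF-8 and compress it with zlib."""
    return zlib.compress(html.encode("utf-8"), level)


def decompress_html(blob: bytes) -> str:
    """Reverse compress_html: decompress and decode back to text."""
    return zlib.decompress(blob).decode("utf-8")


def make_page_document(url: str, html: str) -> dict:
    """Build the document to store (field names are hypothetical)."""
    return {
        "url": url,
        "html_z": compress_html(html),  # compressed bytes
        "encoding": "zlib",             # records how to read html_z back
    }
```

The trade-off is that the server can no longer query or index the HTML itself, so keep anything you need to search on (URL, crawl date, and so on) in separate uncompressed fields.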
If you switch to the new WiredTiger storage engine, which ships with MongoDB 3.0, you can choose between several types of block compression (snappy, zlib, or none), as described in the MongoDB documentation. Of course, you'll want to test this change against production workloads to find out whether the additional CPU utilization is worth the space saved.
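As a rough sketch of how the compressor can be picked per collection: the create command accepts a storageEngine option carrying a WiredTiger config string. The helper below only builds that command document. Its name and the "pages" collection are hypothetical, and it assumes the server is already running WiredTiger.

```python
# Block compressors available in MongoDB 3.0 (later releases add more).
SUPPORTED_COMPRESSORS = {"none", "snappy", "zlib"}


def build_create_command(collection: str, compressor: str = "zlib") -> dict:
    """Return a `create` command that sets the WiredTiger block compressor."""
    if compressor not in SUPPORTED_COMPRESSORS:
        raise ValueError(f"unsupported compressor: {compressor!r}")
    return {
        "create": collection,  # the command name must be the first key
        "storageEngine": {
            "wiredTiger": {"configString": f"block_compressor={compressor}"}
        },
    }


# Usage with PyMongo (needs a running mongod, so it is not executed here):
#   from pymongo import MongoClient
#   db = MongoClient()["scraping"]
#   db.command(build_create_command("pages", "zlib"))
```

snappy is the default and is cheap on CPU; zlib usually compresses text such as HTML better at a higher CPU cost, which is exactly the trade-off to measure.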