I'm implementing a program that needs to serialize and deserialize large objects, so I ran some tests with pickle, cPickle and marshal.
You could improve storage efficiency by compressing the serialized result.
My hunch is that decompressing the data and feeding it to the deserializer would be faster than reading the raw data from an HDD.
The test below was meant to show that compression speeds up deserialization. The result wasn't as expected, because the machine was equipped with an SSD. On an HDD-equipped machine, compressing the data with lz4 would be faster, since reads from disk average 60-70 MB/s.
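For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of such a benchmark. It is not the original test script: the helper names (roundtrip, run, CODECS) are made up for illustration, it uses Python 3's pickle where the original used cPickle, and it uses zlib as a standard-library stand-in for lz4, which is a third-party package. It times only decompression plus deserialization from memory, so disk reads are not included.

```python
import bz2
import marshal
import pickle
import time
import zlib


def identity(data):
    """No-op codec used for the uncompressed baseline."""
    return data


# Hypothetical codec table. zlib stands in for lz4 here because lz4 is not
# part of the standard library.
CODECS = {
    "bz2": (bz2.compress, bz2.decompress),
    "zlib": (zlib.compress, zlib.decompress),
    "---": (identity, identity),
}


def roundtrip(serializer, compress, decompress, obj):
    """Return (seconds to decompress and deserialize, payload size in bytes)."""
    payload = compress(serializer.dumps(obj))
    start = time.perf_counter()
    restored = serializer.loads(decompress(payload))
    elapsed = time.perf_counter() - start
    assert restored == obj  # sanity check: the data survived the round trip
    return elapsed, len(payload)


def run(obj):
    """Measure every serializer/codec combination for one object."""
    results = {}
    for serializer in (marshal, pickle):
        for name, (compress, decompress) in CODECS.items():
            key = (serializer.__name__, name)
            results[key] = roundtrip(serializer, compress, decompress, obj)
    return results


if __name__ == "__main__":
    # Synthetic sample data; replace with the real objects to be stored.
    sample = [{"id": i, "values": list(range(10))} for i in range(5000)]
    for (serializer_name, codec), (seconds, size) in run(sample).items():
        print(serializer_name, codec, seconds, size)
```

To include the disk in the measurement, the payload would have to be written to a file and read back inside the timed section, ideally with the OS file cache cleared between runs.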
LZ4 (with marshal): at a cost of about 18% in speed, compression saves 77.6% of the storage.
marshal (--- = no compression)
Compression   Time (s)              Size (bytes)
Bz2           7.492605924606323     10363490
Lz4           1.3733329772949219    46018121
---           1.126852035522461     205618472
cPickle (--- = no compression)
Compression   Time (s)              Size (bytes)
Bz2           15.488649845123291    10650522
Lz4           9.192650079727173     55388264
---           8.839831113815308     204340701