I have a full inverted index in form of nested python dictionary. Its structure is :
{word : { doc_name : [location_list] } }
For example l
A common pattern in Python 2.x is to have one version of a module implemented in pure Python, with an optional accelerated version implemented as a C extension; for example, pickle and cPickle. This places the burden of importing the accelerated version and falling back on the pure Python version on each user of these modules. In Python 3.0, the accelerated versions are considered implementation details of the pure Python versions. Users should always import the standard version, which attempts to import the accelerated version and falls back to the pure Python version. The pickle / cPickle pair received this treatment.
If your dictionary is huge and should only be compatible with Python 3.4 or higher, use:
pickle.dump(obj, file, protocol=4)
pickle.load(file, encoding="bytes")
or:
Pickler(file, 4).dump(obj)
Unpickler(file).load()
That said, in 2010 the json module was 25 times faster at encoding and 15 times faster at decoding simple types than pickle. My 2014 benchmark says marshal > pickle > json, but marshal's coupled to specific Python versions.