How to reduce the time taken to load a pickle file in Python

星月不相逢 2020-12-08 02:47

I have created a dictionary in Python and dumped it into a pickle file. Its size came to 300 MB. Now I want to load that pickle back:

output = open('myfile.pkl', 'rb')
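For reference, the full dump-and-load round trip described above can be sketched like this (the dictionary contents are a placeholder; `myfile.pkl` is the file name from the question):

```python
import pickle

# Small placeholder dictionary standing in for the ~300 MB one.
mydict = {i: str(i) for i in range(1000)}

# Dump with the highest protocol: it produces a more compact file
# that is also faster to load back.
with open('myfile.pkl', 'wb') as f:
    pickle.dump(mydict, f, protocol=pickle.HIGHEST_PROTOCOL)

# Load it back; the `with` block guarantees the file handle is closed.
with open('myfile.pkl', 'rb') as f:
    loaded = pickle.load(f)
```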


        
3 Answers
  •  Happy的楠姐
    2020-12-08 02:49

    I've had good results reading huge files (e.g. a ~750 MB igraph object stored as a binary pickle) with cPickle itself. This was achieved by simply wrapping the pickle load call as mentioned here.

    Example snippet in your case would be something like:

    import timeit
    import cPickle as pickle  # Python 2; on Python 3, plain `import pickle` already uses the C implementation
    import gc


    def load_cpickle_gc():
        output = open('myfile3.pkl', 'rb')

        # disable the garbage collector while loading; pickle creates
        # many objects, and GC passes over them add needless overhead
        gc.disable()

        mydict = pickle.load(output)

        # enable the garbage collector again
        gc.enable()
        output.close()
        return mydict


    if __name__ == '__main__':
        # assumes this snippet is saved as pickle_wr.py, so that
        # timeit can import it by that name
        print "cPickle load (with gc workaround): "
        t = timeit.Timer(stmt="pickle_wr.load_cpickle_gc()", setup="import pickle_wr")
        print t.timeit(1), '\n'
    

    There may well be more apt ways to get the task done, but this workaround reduces the time required drastically: for me, from 843.04 s to 41.28 s, roughly 20x.
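    On Python 3, where cPickle has been folded into the `pickle` module, the same garbage-collector workaround can be sketched as below. The file name and dictionary contents are placeholders; timing here uses `time.perf_counter` rather than `timeit` for brevity:

```python
import gc
import pickle
import time

# Build and dump a sample dictionary (placeholder data and file name).
data = {i: str(i) * 10 for i in range(100_000)}
with open('myfile3.pkl', 'wb') as f:
    pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle_gc(path):
    # Disable the garbage collector while loading: pickle.load creates
    # a large number of objects, and collection passes over them are
    # wasted work since none of them are garbage yet.
    gc.disable()
    try:
        with open(path, 'rb') as f:
            return pickle.load(f)
    finally:
        gc.enable()


start = time.perf_counter()
mydict = load_pickle_gc('myfile3.pkl')
print('loaded %d keys in %.3f s' % (len(mydict), time.perf_counter() - start))
```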
