How do I prevent a memory leak when loading large pickle files in a for loop?

Submitted by 醉酒当歌 on 2019-12-19 13:58:54

Question


I have 50 pickle files that are 0.5 GB each. Each pickle file consists of a list of custom class objects. I have no trouble loading the files individually using the following function:

import pickle

def loadPickle(fp):
    # Load and return the list of objects stored in one pickle file.
    with open(fp, 'rb') as fh:
        listOfObj = pickle.load(fh)
    return listOfObj

However, when I try to iteratively load the files I get a memory leak.

l = ['filepath1', 'filepath2', 'filepath3', 'filepath4']
for fp in l:
    x = loadPickle(fp)
    print('loaded {0}'.format(fp))

My memory overflows before "loaded filepath2" is printed. How can I write code that guarantees that only a single pickle's contents are in memory during each iteration?

Answers to related questions on SO suggest using objects defined in the weakref module or forcing explicit garbage collection with the gc module, but I am having a difficult time seeing how to apply these methods to my particular use case, because I have an insufficient understanding of how referencing works under the hood.
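For reference, this is roughly how I understood the gc suggestion would be applied to the loop above; the del and the explicit collect() call are my own guesses, not something I have confirmed actually helps:

import gc

for fp in l:
    x = loadPickle(fp)
    print('loaded {0}'.format(fp))
    del x          # drop the reference to the loaded list
    gc.collect()   # ask the collector to reclaim the memory immediately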

Related: Python garbage collection


Answer 1:


You can fix this by adding x = None right after for fp in l:.

This works because it drops the reference that x holds to the previous iteration's list, allowing Python's garbage collector to free that memory before loadPickle() is called again. Without it, the old list is still referenced by x while the next pickle is being loaded, so two 0.5 GB lists are alive in memory at the same time.
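A minimal sketch of the fixed loop, reusing the loadPickle() function and the placeholder filepaths from the question:

l = ['filepath1', 'filepath2', 'filepath3', 'filepath4']
for fp in l:
    x = None               # release the previous iteration's list first
    x = loadPickle(fp)     # now only one pickle's worth of objects is alive
    print('loaded {0}'.format(fp))

Deleting the name with del x at the end of the loop body would have the same effect; the key point is that x must stop referencing the old list before the next load allocates memory for the new one.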



Source: https://stackoverflow.com/questions/16288936/how-do-i-prevent-memory-leak-when-i-load-large-pickle-files-in-a-for-loop
