Memory leak using pandas dataframe

Posted by 感情迁移 on 2019-12-17 16:35:32

Question


I am using pandas.DataFrame in multi-threaded code (actually a custom subclass of DataFrame called Sound). I have noticed a memory leak: the memory usage of my program grows gradually over about 10 minutes until it reaches ~100% of my computer's memory and the program crashes.

I used objgraph to try to track down this leak, and found that the count of MyDataFrame instances keeps going up when it shouldn't: every thread creates an instance in its run method, does some calculations, saves the result to a file, and exits, so no references should be kept.
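The instance-count check described above can be sketched with the standard library alone. This is a minimal illustration, not the original code: the `Sound` class here is a plain hypothetical stand-in rather than a real DataFrame subclass, and `count_instances` and `worker` are made-up names. It uses `gc.get_objects()` instead of objgraph so that it runs without extra dependencies.

```python
import gc
import threading


class Sound:
    """Hypothetical stand-in for the DataFrame subclass in the question."""

    def __init__(self, n):
        self.samples = list(range(n))


def count_instances(cls):
    """Count objects of type cls that the garbage collector still tracks."""
    gc.collect()  # drop unreachable reference cycles before counting
    return sum(1 for obj in gc.get_objects() if isinstance(obj, cls))


def worker():
    sound = Sound(1000)
    total = sum(sound.samples)  # placeholder for the real calculation
    # 'sound' goes out of scope when the function returns


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

leaked = count_instances(Sound)
print("live Sound instances after all threads exited:", leaked)
```

If the count stays above zero after every thread has finished, something is still holding references to the instances. If it drops to zero while process memory stays high, the objects themselves are probably not what is leaking.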

Using objgraph, I found that all the data frames in memory have a similar reference graph:

I have no idea whether that's normal or not; it looks like this is what is keeping my objects in memory. Any ideas, advice, or insight?


Answer 1:


Confirmed that there's some kind of memory leak going on in the indexing infrastructure. It's not caused by the above reference graph. Let's move the discussion to GitHub (SO is for Q&A):

https://github.com/pydata/pandas/issues/2659

EDIT: this actually appears not to be a memory leak at all; it may instead be related to how the OS allocates memory. Please have a look at the GitHub issue for more information.
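One way to tell a leak of Python objects apart from memory that the allocator or the OS simply has not reclaimed is to compare Python-level allocations before and after freeing the data. The following is a rough sketch using the standard `tracemalloc` module; it is not taken from the GitHub issue, and the list-of-lists workload is an arbitrary placeholder for the real data frames.

```python
import gc
import tracemalloc

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

# Placeholder workload: allocate a few megabytes of Python objects.
data = [list(range(1000)) for _ in range(200)]
in_use, _ = tracemalloc.get_traced_memory()

# Drop the only reference and force a collection.
del data
gc.collect()
after_free, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("traced bytes at baseline:  ", baseline)
print("traced bytes while in use: ", in_use)
print("traced bytes after freeing:", after_free)
```

If the traced size falls back toward the baseline while the process's resident memory, as reported by the OS, stays high, the objects have been released and the remaining usage comes from memory the allocator has not yet returned to the OS.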



Source: https://stackoverflow.com/questions/14224068/memory-leak-using-pandas-dataframe
