No space left while using Multiprocessing.Array in shared memory

Anonymous (unverified), submitted 2019-12-03 01:33:01

Question:

I am using the multiprocessing functions of Python to run my code in parallel on a machine with roughly 500 GB of RAM. To share some arrays between the different workers I am creating an Array object:

import ctypes
import multiprocessing

import numpy as np

N = 150
ndata = 10000
sigma = 3
ddim = 3

shared_data_base = multiprocessing.Array(ctypes.c_double, ndata*N*N*ddim*sigma*sigma)
shared_data = np.ctypeslib.as_array(shared_data_base.get_obj())
shared_data = shared_data.reshape(-1, N, N, ddim*sigma*sigma)

This works perfectly for sigma=1, but for sigma=3 one of the hard drives of the machine slowly fills up until there is no free space left, and the process then fails with this exception:

OSError: [Errno 28] No space left on device
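For scale, a quick back-of-the-envelope calculation with the numbers from the snippet above (each element is a C double, i.e. 8 bytes) shows how much bigger the buffer gets when going from sigma=1 to sigma=3:

N, ndata, ddim = 150, 10000, 3

for sigma in (1, 3):
    n_elements = ndata * N * N * ddim * sigma * sigma
    print(sigma, n_elements * 8 / 1e9, "GB")

# sigma=1 -> about 5.4 GB, sigma=3 -> about 48.6 GB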

Now I've got 2 questions:

  1. Why does this code write anything to the disk at all? Why isn't it all stored in memory?
  2. How can I solve this problem? Can I make Python store the array entirely in RAM without writing it to the HDD? Or can I change the HDD to which this array is written?

EDIT: I found something online which suggests that the array is stored in "shared memory". But the /dev/shm device has far more free space than /dev/sda1, which is what is being filled up by the code above. Here is the (relevant part of the) strace log of this code.
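To see where the backing file actually ends up, the directory can be queried directly. This is only a sketch relying on multiprocessing's internal util.get_temp_dir() helper, which is undocumented and may behave differently across Python versions:

from multiprocessing import util

# Internal, undocumented helper: returns the directory in which multiprocessing
# creates the file backing an Array, creating (and caching) a "pymp-..."
# directory under tempfile.gettempdir() (usually /tmp) on first use.
print(util.get_temp_dir())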

EDIT #2: I think I have found a workaround for this problem. Looking at the source, I found that multiprocessing tries to create a temporary file in a directory which is determined by using

process.current_process()._config.get('tempdir')

Setting this value manually at the beginning of the script,

from multiprocessing import process
process.current_process()._config['tempdir'] = '/data/tmp/'

seems to solve the issue. But I don't think this is the best way to handle it. So: are there any other suggestions for how to deal with this?
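For reference, one alternative I am considering is redirecting the temporary directory through the environment instead of touching _config. This is only a sketch; it assumes that, when no tempdir is configured, multiprocessing derives its temp directory from tempfile, which honors the TMPDIR environment variable; that is an implementation detail rather than documented behaviour:

import os

# Assumption: multiprocessing falls back to tempfile for its temp directory,
# and tempfile reads TMPDIR on first use. This must run before anything touches
# tempfile or multiprocessing; alternatively, export it in the shell, e.g.
#   TMPDIR=/data/tmp python myscript.py
os.environ['TMPDIR'] = '/data/tmp/'

import ctypes
import multiprocessing

# The file backing this Array should now be created under /data/tmp/ instead of /tmp.
shared_data_base = multiprocessing.Array(ctypes.c_double, 10)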
