Question
I'm using Slurm to manage some of our calculations, but sometimes jobs get killed with an out-of-memory error even though this should not be the case. This strange issue occurs in particular with Python jobs that use multiprocessing.
Here's a minimal example to reproduce this behavior:
#!/usr/bin/python
from time import sleep
nmem = int(3e7) # this will amount to ~1GB of numbers
nprocs = 200 # will create this many workers later
nsleep = 5 # sleep seconds
array = list(range(nmem)) # allocate some memory
print("done allocating memory")
sleep(nsleep)
print("continuing with multiple processes (" + str(nprocs) + ")")
from multiprocessing import Pool
def f(i):
    sleep(nsleep)
# this will create a pool of workers, each of which "seem" to use 1GB
# even though the individual processes don't actually allocate any memory
p = Pool(nprocs)
p.map(f,list(range(nprocs)))
print("finished successfully")
Even though this may run fine locally, Slurm's memory accounting seems to sum up the resident memory of each of these processes, leading to a reported memory use of nprocs x 1 GB rather than just 1 GB (the actual usage). I don't think that's what it should do, and it's not what the OS is doing either; the machine doesn't appear to be swapping or anything.
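As a rough illustration of the difference (just a sketch I put together, assuming a Linux /proc filesystem; /proc/<pid>/smaps_rollup needs a reasonably recent kernel), the snippet below sums per-process RSS and per-process PSS over the parent and a small pool of forked workers. RSS counts the copy-on-write pages inherited from the parent once per process, while PSS splits each shared page among the processes mapping it:
import os
from time import sleep
from multiprocessing import Pool

array = list(range(int(3e7)))  # ~1 GB, allocated once in the parent

def proc_kb(pid, path, key):
    # sum all "<key> ... kB" lines of /proc/<pid>/<path>, e.g. "VmRSS:" or "Pss:"
    total = 0
    with open("/proc/%d/%s" % (pid, path)) as fh:
        for line in fh:
            if line.startswith(key):
                total += int(line.split()[1])
    return total

def worker(_):
    sleep(5)           # the workers themselves allocate nothing
    return os.getpid()

if __name__ == "__main__":
    pool = Pool(8)     # a handful of workers is enough to see the effect
    pids = set(pool.map(worker, range(8)))  # worker pids (deduplicated)
    pids.add(os.getpid())                   # include the parent
    rss = sum(proc_kb(pid, "status", "VmRSS:") for pid in pids)
    pss = sum(proc_kb(pid, "smaps_rollup", "Pss:") for pid in pids)
    pool.close()
    pool.join()
    print("summed RSS: %d kB" % rss)  # about (nworkers + 1) x 1 GB here
    print("summed PSS: %d kB" % pss)  # about 1 GB in total
Summing RSS like this is, as far as I can tell, what the accounting appears to be doing, while the PSS total is much closer to the actual physical use.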
Here's the output if I run the code locally:
> python test-slurm-mem.py
done allocating memory
continuing with multiple processes (0)
finished successfully
And a screenshot of htop, where each worker appears to use ~1 GB of resident memory:
And here's the output if I run the same script through Slurm:
> srun --nodelist=compute3 --mem=128G python test-slurm-mem.py
srun: job 694697 queued and waiting for resources
srun: job 694697 has been allocated resources
done allocating memory
continuing with multiple processes (200)
slurmstepd: Step 694697.0 exceeded memory limit (193419088 > 131968000), being killed
srun: Exceeded job memory limit
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: *** STEP 694697.0 ON compute3 CANCELLED AT 2018-09-20T10:22:53 ***
srun: error: compute3: task 0: Killed
> sacct --format State,ExitCode,JobName,ReqCPUs,MaxRSS,AveCPU,Elapsed -j 694697.0
     State ExitCode    JobName  ReqCPUS     MaxRSS     AveCPU    Elapsed
---------- -------- ---------- -------- ---------- ---------- ----------
CANCELLED+      0:9     python        2 193419088K   00:00:04   00:00:13
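For what it's worth, a quick back-of-the-envelope check (my own arithmetic, using the numbers above) shows that the reported MaxRSS is about what you'd get if every one of the 200 workers were charged the parent's ~1 GB:
maxrss_kib = 193419088                   # MaxRSS from sacct, in KiB
nprocs = 200                             # pool size in the test script
total_gib = maxrss_kib / 1024.0 ** 2
print("total: %.1f GiB" % total_gib)                    # ~184.5 GiB
print("per process: %.2f GiB" % (total_gib / nprocs))   # ~0.92 GiB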
Source: https://stackoverflow.com/questions/52421171/slurm-exceeded-job-memory-limit-with-python-multiprocessing