How to set slurm/salloc for 1 gpu per task but let job use multiple gpus?

Submitted by 我与影子孤独终老i on 2021-02-18 18:13:36

Question


We are looking for some advice with slurm salloc gpu allocations. Currently, given:

% salloc -n 4 -c 2 --gres=gpu:1
% srun env | grep CUDA   
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0

However, we want more than just device 0 to be used.
Is there a way to set up the salloc with srun/mpirun so that we get the following?

CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=2
CUDA_VISIBLE_DEVICES=3

The goal is for each task to get 1 GPU, with overall GPU usage spread across the 4 available devices (see gres.conf below), rather than all tasks being pinned to device 0.

That way tasks are not waiting for device 0 to free up from other tasks, as is currently the case.

Or is this the expected behavior even when we have more than 1 GPU available/free (4 total) for the 4 tasks? What are we missing or misunderstanding?

  • salloc / srun parameter?
  • slurm.conf or gres.conf setting?

Summary: We want to use Slurm and MPI such that each rank/task uses 1 GPU, while the job spreads its tasks/ranks across the 4 GPUs. Currently it appears we are limited to device 0 only. We also want to avoid multiple srun submissions within an salloc/sbatch, because we are using MPI.

OS: CentOS 7

Slurm version: 16.05.6

Are we forced to use wrapper-based methods for this?

Are there differences between Slurm versions (14 vs. 16) in how GPUs are allocated?

Thank you!

Reference: gres.conf

Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
Name=gpu File=/dev/nvidia2
Name=gpu File=/dev/nvidia3
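
A gres.conf like this is normally paired with a GRES declaration in slurm.conf. As a rough sketch only (the node name and CPU count below are placeholders, not taken from the question), that pairing usually looks like:

GresTypes=gpu
NodeName=gpunode01 Gres=gpu:4 CPUs=8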

Answer 1:


First of all, try requesting four GPUs with

% salloc -n 4 -c 2 --gres=gpu:4

With --gres=gpu:1, it is the expected behaviour that all tasks see only one GPU. With --gres=gpu:4, the output would be

CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3
CUDA_VISIBLE_DEVICES=0,1,2,3

To get what you want, you can use a wrapper script, or modify your srun command like this:

srun bash -c 'CUDA_VISIBLE_DEVICES=$SLURM_PROCID env' | grep CUDA

then you will get

CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=1
CUDA_VISIBLE_DEVICES=2
CUDA_VISIBLE_DEVICES=3
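
The wrapper-script route mentioned above could look roughly like the following. This is only a sketch and the script name is made up; it uses SLURM_LOCALID (the task's node-local rank) instead of SLURM_PROCID, so the mapping stays correct even if the job spans more than one node:

#!/bin/bash
# one_gpu_per_task.sh (hypothetical name): give each task its own GPU,
# then exec the real program passed as arguments.
export CUDA_VISIBLE_DEVICES=$SLURM_LOCALID
exec "$@"

It would be used as, e.g., srun ./one_gpu_per_task.sh ./my_mpi_program (the program name is a placeholder).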



Answer 2:


This feature is planned for 19.05. See https://bugs.schedmd.com/show_bug.cgi?id=4979 for details.
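
Once that feature is available, the idea is that one GPU per task can be requested directly. In newer Slurm releases the option is --gpus-per-task; this is an assumption about post-19.05 behaviour, not something available in 16.05:

% srun -n 4 --gpus-per-task=1 env | grep CUDA

Each task should then be limited to a single, distinct device (on some versions an explicit --gpu-bind option is also needed).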

Be warned that the 'srun bash...' solution suggested will break if your job doesn't request all GPUs on that node, because another process may be in control of GPU0.
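
A slightly more defensive wrapper (again only a sketch, with a made-up script name) picks the task's device out of whatever Slurm actually put into CUDA_VISIBLE_DEVICES, instead of assuming the devices are numbered from 0:

#!/bin/bash
# pick_gpu.sh (hypothetical name): split the granted device list and keep
# only the entry matching this task's node-local rank.
IFS=',' read -ra DEVS <<< "$CUDA_VISIBLE_DEVICES"
export CUDA_VISIBLE_DEVICES=${DEVS[$SLURM_LOCALID]}
exec "$@"

This still assumes the job was granted at least as many GPUs on the node as it runs tasks there.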



Source: https://stackoverflow.com/questions/46061043/how-to-set-slurm-salloc-for-1-gpu-per-task-but-let-job-use-multiple-gpus
