How to prevent two CUDA programs from interfering

穿精又带淫゛_ 提交于 2019-11-30 05:42:34

问题


I've noticed that if two users try to run CUDA programs at the same time, it tends to lock up either the card or the driver (or both?). We need to either reset the card or reboot the machine to restore normal behavior.

Is there a way to get a lock on the GPU so other programs can't interfere while it's running?

Edit

OS is Ubuntu 11.10 running on a server. While there is no X Windows running, the card is used to display the text system console. There are multiple users.


回答1:


If you are running on either Linux or Windows with the TCC driver, you can put the GPU into compute exclusive mode using the nvidia-smi utility.

Compute exclusive mode makes the driver refuse a context establishment request if another process already holds a context on that GPU. Any process trying to run on a busy compute exclusive GPU will receive a no device available error and fail.




回答2:


You can use something like Task Spooler to queue the programs and run one at the time.

We use TORQUE Resource Manager but it's harder to configure than ts. With TORQUE you can have multiple queues (ie one for cuda jobs, two for cpu jobs) and assign a different job to each gpu.



来源:https://stackoverflow.com/questions/13900078/how-to-prevent-two-cuda-programs-from-interfering

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!