How to use multiple GPUs in pytorch?


Question


I am learning PyTorch and following this tutorial: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html

I use this line to select a GPU:

device = torch.device("**cuda:0**" if torch.cuda.is_available() else "cpu")

But I want to use two GPUs in Jupyter, like this:

device = torch.device("**cuda:0,1**" if torch.cuda.is_available() else "cpu")

Of course, this is wrong. So how can I do this?


Answer 1:


Using multiple GPUs is as simple as wrapping the model in DataParallel and increasing the batch size. Check these two tutorials for a quick start; a minimal sketch follows the list:

  • Multi-GPU Examples
  • Data Parallelism
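For example, a minimal sketch (MyModel is a placeholder for your own model class, and at least two GPUs are assumed to be visible):

import torch
import torch.nn as nn

model = MyModel()  # hypothetical model class
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate the model across all visible GPUs
model.to("cuda")  # parameters live on cuda:0; DataParallel scatters each input batch across the GPUs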



Answer 2:


Assuming that you want to distribute the data across the available GPUs (if you have a batch size of 16 and 2 GPUs, you are probably looking to provide 8 samples to each GPU), rather than spreading parts of the model across different GPUs, this can be done as follows:

If you want to use all the available GPUs:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = CreateModel()

model = nn.DataParallel(model)
model.to(device)

If you want to use specific GPUs (for example, 2 out of 4 GPUs):

device = torch.device("cuda:1,3" if torch.cuda.is_available() else "cpu") ## specify the GPU id's, GPU id's start from 0.

model = CreateModel()

model = nn.DataParallel(model, device_ids=[1, 3])
model.to(device)

To use specific GPUs by setting an OS environment variable:

Before executing the program, set the CUDA_VISIBLE_DEVICES variable as follows:

export CUDA_VISIBLE_DEVICES=1,3  # assuming you want to select the 2nd and 4th GPUs
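The same restriction can also be applied from inside Python, as long as it happens before CUDA is initialized (a minimal sketch; setting the variable before the first torch import is the safest):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1,3"  # must be set before CUDA is initialized

import torch
print(torch.cuda.device_count())  # reports 2; the selected GPUs now appear as cuda:0 and cuda:1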

Then, within the program, you can just use DataParallel() as though you wanted to use all the GPUs (similar to the first case). Here the GPUs available to the program are restricted by the OS environment variable.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = CreateModel()

model = nn.DataParallel(model)
model.to(device)

In all of these cases, the data has to be mapped to the device.

If X and y are the data:

X = X.to(device)
y = y.to(device)
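In a typical training loop this looks like the following (a minimal sketch; model, loader, criterion, and optimizer are assumed to be defined elsewhere):

for X, y in loader:
    # .to() returns a copy on the target device, so the result must be reassigned
    X = X.to(device)
    y = y.to(device)

    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()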




Answer 3:


Another option would be to use some helper libraries for PyTorch:

PyTorch Ignite: Distributed GPU training

It provides a context manager for distributed configuration on:

  • nccl - torch native distributed configuration on multiple GPUs
  • xla-tpu - TPUs distributed configuration

PyTorch Lightning: Multi-GPU training

This is possibly the best option, IMHO, to train on CPU/GPU/TPU without changing your original PyTorch code; a minimal sketch follows below.
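For illustration, a minimal Lightning-style sketch (LitModel and train_loader are hypothetical; the exact Trainer arguments for selecting GPUs vary between Lightning versions):

import pytorch_lightning as pl

model = LitModel()  # hypothetical LightningModule subclass
trainer = pl.Trainer(accelerator="gpu", devices=2)  # argument names may differ in older versions
trainer.fit(model, train_loader)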

It is also worth checking Catalyst for similar distributed GPU options.



Source: https://stackoverflow.com/questions/54216920/how-to-use-multiple-gpus-in-pytorch
