Question
I am learning PyTorch and following this tutorial: https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html
I use this line to select a GPU:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
But I want to use two GPUs in Jupyter, like this:
device = torch.device("cuda:0,1" if torch.cuda.is_available() else "cpu")
Of course, this is wrong. So how can I do this?
Answer 1:
Using multiple GPUs is as simple as wrapping the model in DataParallel and increasing the batch size. Check these two tutorials for a quick start; a minimal sketch follows the list below:
- Multi-GPU Examples
- Data Parallelism
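A minimal sketch of that idea; CreateModel and train_dataset are hypothetical placeholders, and the batch-size scaling is just the rule of thumb from this answer, not a requirement:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

model = CreateModel()           # hypothetical model constructor
model = nn.DataParallel(model)  # replicate the model across all visible GPUs
model.to("cuda")

# Scale the batch size with the GPU count so each replica still receives a full batch.
n_gpus = torch.cuda.device_count()
loader = DataLoader(train_dataset, batch_size=16 * max(1, n_gpus), shuffle=True)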
Answer 2:
Assuming that you want to distribute the data across the available GPUs (if you have a batch size of 16 and 2 GPUs, each GPU would receive 8 samples), rather than spread parts of the model across different GPUs, this can be done as follows:
If you want to use all the available GPUs:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CreateModel()
model= nn.DataParallel(model)
model.to(device)
If you want to use specific GPUs (for example, 2 out of 4 GPUs):
device = torch.device("cuda:1,3" if torch.cuda.is_available() else "cpu") ## specify the GPU id's, GPU id's start from 0.
model = CreateModel()
model= nn.DataParallel(model,device_ids = [1, 3])
model.to(device)
To use specific GPUs by setting an OS environment variable:
Before executing the program, set the CUDA_VISIBLE_DEVICES variable as follows:
export CUDA_VISIBLE_DEVICES=1,3  (assuming you want to select the 2nd and 4th GPUs)
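Since the question mentions Jupyter, where a shell export is not available, a common alternative is to set the same variable from Python at the top of the notebook, before any CUDA work is done (a sketch of that alternative):
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1,3"  # must be set before the first CUDA call
import torch                                # torch now sees only GPUs 1 and 3, renumbered as cuda:0 and cuda:1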
Then, within the program, you can just use DataParallel() as though you wanted to use all the GPUs (similar to the first case). Here the GPUs available to the program are restricted by the environment variable.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = CreateModel()
model= nn.DataParallel(model)
model.to(device)
In all of these cases, the data has to be mapped to the device.
If X and y are the data:
X = X.to(device)  # tensor .to() is not in-place; the result must be reassigned
y = y.to(device)
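For a fuller picture, a minimal sketch of a training loop that moves each batch to the device; loader, criterion, and optimizer are hypothetical names not defined in the answer above:
for X, y in loader:
    X, y = X.to(device), y.to(device)  # move each batch to the primary device
    optimizer.zero_grad()
    loss = criterion(model(X), y)      # DataParallel scatters X across GPUs and gathers the outputs
    loss.backward()
    optimizer.step()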
Answer 3:
Another option would be to use some helper libraries for PyTorch:
PyTorch Ignite: Distributed GPU training
It provides a context manager for distributed configuration on the following backends (see the sketch after this list):
- nccl - torch native distributed configuration on multiple GPUs
- xla-tpu - TPUs distributed configuration
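A rough sketch of the Ignite context-manager idea, assuming a user-defined training function; names are illustrative and the exact API may differ between Ignite versions:
import ignite.distributed as idist

def training(local_rank):
    device = idist.device()                  # device assigned to the current process
    model = idist.auto_model(CreateModel())  # wraps the model for the selected backend
    # ... build the optimizer/data loaders and run the usual training loop here ...

with idist.Parallel(backend="nccl", nproc_per_node=2) as parallel:
    parallel.run(training)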
PyTorch Lightning: Multi-GPU training
This is possibly the best option, IMHO, for training on CPU/GPU/TPU without changing your original PyTorch code.
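As a rough illustration of the Lightning route; LitModel is a hypothetical LightningModule, and the exact Trainer arguments depend on the installed Lightning version:
import pytorch_lightning as pl

model = LitModel()                                  # hypothetical LightningModule (defines training_step, etc.)
trainer = pl.Trainer(accelerator="gpu", devices=2)  # ask Lightning to train on 2 GPUs
trainer.fit(model)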
Worth checking Catalyst for similar distributed GPU options.
Source: https://stackoverflow.com/questions/54216920/how-to-use-multiple-gpus-in-pytorch