How to use multiprocessing in PyTorch?


Question


I'm trying to use PyTorch with a complex loss function. To speed up the code, I would like to use the PyTorch multiprocessing package.

In the first trial, I put 10x1 features into the NN and get a 10x4 output.

After that, I want to pass those 10x4 parameters into a function to do some calculation. (The calculation will become more complex in the future.)

After the calculation, the function returns a 10x1 array. This array is stored as NN_energy and used to compute the loss function.

Besides, I would also like to know whether there is another way to create a gradient-tracking (backward-able) array to store NN_energy, instead of using

NN_energy = net(Data_in)[0:10,0]

Thanks a lot.

Full Code:

import torch
import numpy as np
from torch.autograd import Variable 
from torch import multiprocessing

def func(msg,BOP):
    ans = (BOP[msg][0]+BOP[msg][1]/BOP[msg][2])*BOP[msg][3]
    return ans

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden_1, n_hidden_2, n_output):
        super(Net, self).__init__()
        self.hidden_1 = torch.nn.Linear(n_feature , n_hidden_1)  # hidden layer
        self.hidden_2 = torch.nn.Linear(n_hidden_1, n_hidden_2)  # hidden layer
        self.predict  = torch.nn.Linear(n_hidden_2, n_output  )  # output layer

    def forward(self, x):
        x = torch.tanh(self.hidden_1(x))      # activation function for hidden layer
        x = torch.tanh(self.hidden_2(x))      # activation function for hidden layer
        x = self.predict(x)                   # linear output
        return x

if __name__ == '__main__': # apply_async
    Data_in      = Variable( torch.from_numpy( np.asarray(list(range( 0,10))).reshape(10,1) ).float() )
    Ground_truth = Variable( torch.from_numpy( np.asarray(list(range(20,30))).reshape(10,1) ).float() )

    net = Net( n_feature=1 , n_hidden_1=15 , n_hidden_2=15 , n_output=4 )     # define the network
    optimizer = torch.optim.Rprop( net.parameters() )
    loss_func = torch.nn.MSELoss()  # this is for regression mean squared loss 
    NN_output = net(Data_in)   
    args = range(0,10)
    pool = multiprocessing.Pool()
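    # Note: pickling the rows of NN_output to send them to the pool workers is
    # what raises the RuntimeError shown below, since NN_output is a non-leaf
    # tensor that requires grad.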
    return_data = pool.map( func, zip(args, NN_output) )
    pool.close()
    pool.join()

    NN_energy = net(Data_in)[0:10,0]  
    for i in range(0,10):
        NN_energy[i] = return_data[i]

    loss = torch.sqrt( loss_func( NN_energy , Ground_truth ) )     # must be (1. nn output, 2. target) 
    print(loss)

Error messages:

File "C:\ProgramData\Anaconda3\lib\site-packages\torch\multiprocessing\reductions.py", line 126, in reduce_tensor raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "

RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).


Answer 1:


First of all, the Torch Variable API has been deprecated for a very long time; just don't use it.

Next, torch.from_numpy( np.asarray(list(range( 0,10))).reshape(10,1) ).float() is wrong on many levels: calling np.asarray on a list is pointless, since a copy will be performed anyway and np.array takes a list as input by design. Then, np.arange is available to return a range as a NumPy array, and the same function exists in Torch. Next, specifying both dimensions for reshape is unnecessary and error-prone; you could simply do reshape((-1, 1)), or even better unsqueeze(-1). Here is the simplified expression: torch.arange(10, dtype=torch.float32, requires_grad=True).unsqueeze(-1).
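
To make the comparison concrete, here is a small side-by-side check of the two constructions (nothing beyond what is stated above):

import numpy as np
import torch

# Original construction from the question:
a = torch.from_numpy(np.asarray(list(range(0, 10))).reshape(10, 1)).float()

# Simplified equivalent (no NumPy round-trip, no hard-coded shape):
b = torch.arange(10, dtype=torch.float32, requires_grad=True).unsqueeze(-1)

print(a.shape, b.shape)            # both torch.Size([10, 1])
print(torch.equal(a, b.detach()))  # True: same values; only b tracks gradients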

Using a multiprocessing pool is bad practice when batch processing is possible. Batching is both far more efficient and more readable. Indeed, performing N small algebraic operations in parallel is always slower than a single larger algebraic operation, and even more so on GPU. More importantly, computing the gradient is not supported by multiprocessing, hence the error that you get. (This is only partially true: it is supported for tensors on CPU since 1.6.0; have a look at the official release changelog.) A minimal batched sketch is shown below.
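
As an illustration, here is a minimal sketch of what the batched version could look like, assuming the per-row formula of func from the question and reusing the Net class defined there; no pool is involved and the whole expression stays differentiable:

import torch

# Reuses the Net class defined in the question above.
Data_in      = torch.arange(10, dtype=torch.float32).unsqueeze(-1)
Ground_truth = torch.arange(20, 30, dtype=torch.float32).unsqueeze(-1)

net = Net(n_feature=1, n_hidden_1=15, n_hidden_2=15, n_output=4)
loss_func = torch.nn.MSELoss()

NN_output = net(Data_in)                          # shape (10, 4)
# Same arithmetic as func, applied to all 10 rows at once:
NN_energy = (NN_output[:, 0] + NN_output[:, 1] / NN_output[:, 2]) * NN_output[:, 3]
NN_energy = NN_energy.unsqueeze(-1)               # shape (10, 1), still in the autograd graph

loss = torch.sqrt(loss_func(NN_energy, Ground_truth))
loss.backward()                                   # gradients flow through the batched operation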

Could you post a more representative example of what the func method could be, to make sure you really need it?

NB: Distributed autograd, which seems to be what you are looking for, is now available in PyTorch as an experimental feature, in beta since 1.6.0. Have a look at the official documentation.
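
For reference, here is a rough, hypothetical sketch of that API (single local worker, placeholder rendezvous address and port, assuming PyTorch >= 1.6.0); it only shows the shape of the interface, not a solution to the question above:

import os
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc

if __name__ == '__main__':
    os.environ.setdefault("MASTER_ADDR", "localhost")  # placeholder rendezvous settings
    os.environ.setdefault("MASTER_PORT", "29500")
    rpc.init_rpc("worker0", rank=0, world_size=1)      # single worker, just to show the API

    t = torch.arange(10, dtype=torch.float32, requires_grad=True)  # leaf tensor
    with dist_autograd.context() as context_id:
        loss = (t.unsqueeze(-1) * 2.0).sum()
        dist_autograd.backward(context_id, [loss])      # backward inside the distributed context
        grads = dist_autograd.get_gradients(context_id) # dict keyed by tensor
        print(grads[t])                                 # gradient of loss with respect to t

    rpc.shutdown()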



Source: https://stackoverflow.com/questions/56174874/how-to-use-multiprocessing-in-pytorch
