How to use pytorch to construct multi-task DNN, e.g., for more than 100 tasks?

Submitted by 瘦欲 on 2021-02-08 06:25:15

Question


Below is example code that uses pytorch to construct a DNN for two regression tasks. The forward function returns two outputs (x1, x2). What about a network for many regression/classification tasks, e.g., 100 or 1000 outputs? It is definitely not a good idea to hardcode all the outputs (e.g., x1, x2, ..., x100). Is there a simple method to do that? Thank you.

import torch
from torch import nn
import torch.nn.functional as F

class mynet(nn.Module):
    def __init__(self):
        super(mynet, self).__init__()
        self.lin1 = nn.Linear(5, 10)  # shared layer
        self.lin2 = nn.Linear(10, 3)  # head for task 1 (3 outputs)
        self.lin3 = nn.Linear(10, 4)  # head for task 2 (4 outputs)

    def forward(self, x):
        x = self.lin1(x)
        x1 = self.lin2(x)
        x2 = self.lin3(x)
        return x1, x2

if __name__ == '__main__':
    x = torch.randn(1000, 5)
    y1 = torch.randn(1000, 3)
    y2 = torch.randn(1000, 4)
    model = mynet()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
    for epoch in range(100):
        model.train()
        optimizer.zero_grad()
        out1, out2 = model(x)
        loss = 0.2 * F.mse_loss(out1, y1) + 0.8 * F.mse_loss(out2, y2)
        loss.backward()
        optimizer.step()

Answer 1:


You can (and should) use nn containers such as nn.ModuleList or nn.ModuleDict to manage an arbitrary number of sub-modules.

For example (using nn.ModuleList):

class MultiHeadNetwork(nn.Module):
    def __init__(self, list_with_number_of_outputs_of_each_head):
        super(MultiHeadNetwork, self).__init__()
        self.backbone = ...  # build the shared "backbone" whose features feed all the heads
        # one task-specific "head" per requested output size
        self.heads = nn.ModuleList([])
        for nout in list_with_number_of_outputs_of_each_head:
            # each head expects the backbone to produce 10 features
            self.heads.append(nn.Sequential(
                nn.Linear(10, nout * 2),
                nn.ReLU(inplace=True),
                nn.Linear(nout * 2, nout)))

    def forward(self, x):
        common_features = self.backbone(x)  # compute the shared features
        outputs = []
        for head in self.heads:
            outputs.append(head(common_features))
        return outputs

Note that in this example each head is more complex than a single nn.Linear layer.
The number of "heads" is determined by the length of the argument list_with_number_of_outputs_of_each_head, and the number of outputs of each head by its entries.
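
For concreteness, here is a minimal usage sketch that trains this network on 100 regression tasks at once. The concrete backbone (nn.Linear(5, 10) followed by a ReLU, filling in the ... placeholder above) and the per-task output sizes are assumptions for illustration only:

import torch
from torch import nn
import torch.nn.functional as F

head_sizes = [3] * 100  # hypothetical: 100 regression tasks with 3 outputs each
model = MultiHeadNetwork(head_sizes)
# Fill in the placeholder backbone; it must produce the 10 features the heads expect.
model.backbone = nn.Sequential(nn.Linear(5, 10), nn.ReLU(inplace=True))

optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
x = torch.randn(1000, 5)
targets = [torch.randn(1000, n) for n in head_sizes]  # one target tensor per task

model.train()
optimizer.zero_grad()
outputs = model(x)  # list with one output tensor per head
# Sum the per-task losses; per-task weights (like the 0.2/0.8 in the question)
# can be multiplied in here instead.
loss = sum(F.mse_loss(out, tgt) for out, tgt in zip(outputs, targets))
loss.backward()
optimizer.step()

If the tasks have names rather than positions, nn.ModuleDict works the same way: store the heads as a dict keyed by task name and iterate over self.heads.items() in forward.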


Important notice: it is crucial to use nn containers, rather than plain Python lists/dictionaries, to store the sub-modules. Otherwise pytorch does not register them: their parameters are missing from model.parameters() (so the optimizer never updates them), from the state_dict, and from .to(device) calls.
See, e.g., this answer, this question, and this one.
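
A quick way to see what goes wrong with plain containers, as a minimal sketch: parameters held in a Python list are never registered, so model.parameters() returns nothing and the optimizer has nothing to update.

from torch import nn

class PlainList(nn.Module):
    def __init__(self):
        super(PlainList, self).__init__()
        self.heads = [nn.Linear(10, 3) for _ in range(5)]  # NOT registered

class WithModuleList(nn.Module):
    def __init__(self):
        super(WithModuleList, self).__init__()
        self.heads = nn.ModuleList([nn.Linear(10, 3) for _ in range(5)])  # registered

print(len(list(PlainList().parameters())))       # 0  -- optimizer would see nothing
print(len(list(WithModuleList().parameters())))  # 10 -- weight + bias for each of 5 heads

The same registration also drives state_dict(), .to(device), and .train()/.eval() propagation, so unregistered sub-modules break model saving and device moves as well.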



Source: https://stackoverflow.com/questions/59763775/how-to-use-pytorch-to-construct-multi-task-dnn-e-g-for-more-than-100-tasks
