Adding L1/L2 regularization in PyTorch?

南旧 2020-12-24 01:20

Is there any way I can add simple L1/L2 regularization in PyTorch? We can probably compute the regularized loss by simply adding the data loss to the regularization loss.
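As a minimal sketch of the approach described in the question (the model, data, and λ values below are made up for illustration), the penalties can be computed from the parameters and added to the data loss before calling backward:

```python
import torch
import torch.nn as nn

# Hypothetical tiny model and data, purely for illustration
model = nn.Linear(10, 1)
x = torch.randn(32, 10)
target = torch.randn(32, 1)

criterion = nn.MSELoss()
l1_lambda, l2_lambda = 1e-4, 1e-4  # example coefficients

data_loss = criterion(model(x), target)

# L1: sum of absolute values; L2: sum of squared values, over all parameters
l1_penalty = sum(p.abs().sum() for p in model.parameters())
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())

loss = data_loss + l1_lambda * l1_penalty + l2_lambda * l2_penalty
loss.backward()  # gradients now include the regularization terms
```

Since the penalties are ordinary autograd operations, no special handling is needed: the regularization gradients flow into `p.grad` alongside the data-loss gradients.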

5 Answers
  • 2020-12-24 01:47

    Interestingly, torch.norm is slower on CPU but faster on GPU than the direct approach.

    import torch
    x = torch.randn(1024,100)
    y = torch.randn(1024,100)
    
    %timeit torch.sqrt((x - y).pow(2).sum(1))
    %timeit torch.norm(x - y, 2, 1)
    

    Out:

    1000 loops, best of 3: 910 µs per loop
    1000 loops, best of 3: 1.76 ms per loop
    

    On the other hand:

    import torch
    x = torch.randn(1024,100).cuda()
    y = torch.randn(1024,100).cuda()
    
    %timeit torch.sqrt((x - y).pow(2).sum(1))
    %timeit torch.norm(x - y, 2, 1)
    

    Out:

    10000 loops, best of 3: 50 µs per loop
    10000 loops, best of 3: 26 µs per loop
    
  • 2020-12-24 01:52

    For L1 regularization, including weight parameters only:

    L1_reg = torch.tensor(0., requires_grad=True)
    for name, param in model.named_parameters():
        if 'weight' in name:
            L1_reg = L1_reg + torch.norm(param, 1)
    
    total_loss = total_loss + 10e-4 * L1_reg
    
  • 2020-12-24 02:02

    For L2 regularization,

    l2_lambda = 0.01
    l2_reg = torch.tensor(0.)
    for param in model.parameters():
        # out-of-place add keeps the autograd graph intact
        # (note: torch.norm gives ||w||; the classic L2 penalty uses ||w||^2)
        l2_reg = l2_reg + torch.norm(param)
    loss = loss + l2_lambda * l2_reg
    

    References:

    • https://discuss.pytorch.org/t/how-does-one-implement-weight-regularization-l1-or-l2-manually-without-optimum/7951
    • http://pytorch.org/docs/master/torch.html?highlight=norm#torch.norm
  • 2020-12-24 02:04

    The following should help with L2 regularization:

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
    
  • 2020-12-24 02:08

    This is covered in the PyTorch documentation. Have a look at http://pytorch.org/docs/optim.html#torch.optim.Adagrad. You can add an L2 penalty through the weight_decay parameter of the optimizer.

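    To see concretely what weight_decay does, here is a minimal sketch with plain SGD and a single hand-picked parameter (the numbers are made up for illustration): weight_decay=λ adds λ·w to each parameter's gradient before the update, which is the gradient of a (λ/2)·‖w‖² penalty.

    ```python
    import torch

    # One scalar parameter, plain SGD, chosen so the arithmetic is easy to follow
    w = torch.tensor([1.0], requires_grad=True)
    opt = torch.optim.SGD([w], lr=0.1, weight_decay=0.5)

    loss = (w * 2.0).sum()   # dloss/dw = 2
    loss.backward()
    opt.step()
    # effective gradient = 2 + 0.5 * 1.0 = 2.5
    # update: 1.0 - 0.1 * 2.5 = 0.75
    print(w.item())  # 0.75
    ```

    With adaptive optimizers like Adam, this coupled weight decay interacts with the per-parameter scaling, which is why AdamW decouples the decay from the gradient update.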