Is there any way I can add simple L1/L2 regularization in PyTorch? We can probably compute the regularized loss by simply adding the data_loss to the regularization loss.
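For example, a rough sketch of that pattern, where model, criterion, and reg_lambda are just placeholder names for illustration:
import torch
import torch.nn as nn

model = nn.Linear(10, 1)              # placeholder model
criterion = nn.MSELoss()              # placeholder data loss
reg_lambda = 1e-4                     # placeholder regularization strength

inputs, targets = torch.randn(32, 10), torch.randn(32, 1)
data_loss = criterion(model(inputs), targets)
reg_loss = sum(p.abs().sum() for p in model.parameters())   # L1 penalty over all parameters
loss = data_loss + reg_lambda * reg_loss
loss.backward()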
Interestingly, torch.norm is slower on CPU but faster on GPU than the direct approach.
import torch
x = torch.randn(1024,100)
y = torch.randn(1024,100)
%timeit torch.sqrt((x - y).pow(2).sum(1))
%timeit torch.norm(x - y, 2, 1)
Out:
1000 loops, best of 3: 910 µs per loop
1000 loops, best of 3: 1.76 ms per loop
On the other hand:
import torch
x = torch.randn(1024,100).cuda()
y = torch.randn(1024,100).cuda()
%timeit torch.sqrt((x - y).pow(2).sum(1))
%timeit torch.norm(x - y, 2, 1)
Out:
10000 loops, best of 3: 50 µs per loop
10000 loops, best of 3: 26 µs per loop
For L1 regularization, including only the weight parameters:
L1_reg = torch.tensor(0., requires_grad=True)
for name, param in model.named_parameters():
    if 'weight' in name:                        # penalize weights only, not biases
        L1_reg = L1_reg + torch.norm(param, 1)  # add the L1 norm of this parameter

total_loss = total_loss + 10e-4 * L1_reg
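Note that the accumulation is written out of place (L1_reg = L1_reg + ...) rather than with +=: since L1_reg is a leaf tensor created with requires_grad=True, an in-place addition on it would raise a runtime error in PyTorch.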
For L2 regularization:
l2_lambda = 0.01
l2_reg = torch.tensor(0.)
for param in model.parameters():
    l2_reg += torch.norm(param)

loss += l2_lambda * l2_reg
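One thing to be aware of: torch.norm(param) returns the unsquared 2-norm. If you want the squared penalty that matches classic weight decay, a variant (same placeholder names as above) would be:
l2_reg = sum(param.pow(2).sum() for param in model.parameters())
loss = loss + l2_lambda * l2_reg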
The following should help for L2 regularization:
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)
This is covered in the PyTorch documentation; have a look at http://pytorch.org/docs/optim.html#torch.optim.Adagrad. You can add an L2 penalty through the weight_decay parameter of the optimizer.
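For completeness, a minimal sketch of a training step that relies on weight_decay alone, so no explicit penalty term is added to the loss (model, criterion, and the batch are placeholders):
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                      # placeholder model
criterion = nn.MSELoss()                      # placeholder data loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-5)

inputs, targets = torch.randn(32, 10), torch.randn(32, 1)
optimizer.zero_grad()
loss = criterion(model(inputs), targets)      # plain data loss, no explicit L2 term
loss.backward()
optimizer.step()                              # L2 penalty applied via weight_decay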