PyTorch does not converge when approximating square function with linear model

Submitted by 六月ゝ 毕业季﹏ on 2019-12-11 09:58:36

Question


I'm trying to learn some PyTorch and am referencing this discussion here

The author provides a minimum working piece of code that illustrates how you can use PyTorch to solve for an unknown linear function that has been polluted with random noise.

This code runs fine for me.

However, when I change the target function to t = X^2, the parameters do not seem to converge.

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

# Let's make some data for a linear regression.
A = 3.1415926
b = 2.7189351
error = 0.1
N = 100 # number of data points

# Data
X = Variable(torch.randn(N, 1))

# (noisy) Target values that we want to learn.
t = X * X + Variable(torch.randn(N, 1) * error)

# Creating a model, making the optimizer, defining loss
model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

# Run training
niter = 50
for _ in range(0, niter):
    optimizer.zero_grad()
    predictions = model(X)
    loss = loss_fn(predictions, t)
    loss.backward()
    optimizer.step()

    print("-" * 50)
    print("error = {}".format(loss.item()))  # loss is a 0-dim tensor; loss.data[0] fails on current PyTorch
    print("learned A = {}".format(list(model.parameters())[0].data[0, 0]))
    print("learned b = {}".format(list(model.parameters())[1].data[0]))

When I execute this code, the learned A and b parameters are seemingly random, so it does not appear to converge. I think this should converge because you can approximate any function with a slope and an offset. My theory is that I'm using PyTorch incorrectly.

Can anyone identify a problem with my t = X * X + Variable(torch.randn(N, 1) * error) line of code?


Answer 1:


You cannot fit a second-degree polynomial with a linear function. The best the model can do is the least-squares line through your random samples, so the learned A and b depend on which samples were drawn and the loss stays large.
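
One way to see this is to compute the best possible line directly instead of training toward it. The sketch below is illustrative and not part of the original answer: it assumes a recent PyTorch that provides `torch.linalg.lstsq`, uses a fixed seed, and the names `design`, `A_best`, `b_best` and `residual_mse` are hypothetical.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so repeated runs match

error = 0.1
N = 100
X = torch.randn(N, 1)
t = X * X + torch.randn(N, 1) * error  # same noisy quadratic target as the question

# Design matrix [x, 1]: its least-squares solution is the best slope and offset.
design = torch.cat((X, torch.ones(N, 1)), dim=1)
solution = torch.linalg.lstsq(design, t).solution  # shape (2, 1)

A_best = solution[0, 0].item()
b_best = solution[1, 0].item()

# Mean squared error that remains even for the optimal line.
residual_mse = torch.mean((design @ solution - t) ** 2).item()

print("best A = {}".format(A_best))
print("best b = {}".format(b_best))
print("residual MSE = {}".format(residual_mse))
```

For standard-normal x, x and x^2 are uncorrelated and x^2 has mean 1, so the best line should sit near slope 0 and offset 1, with a residual error far above the noise level of error^2 = 0.01.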
What you can do instead is give the model two inputs, x and x^2, and fit a linear function of both:

model = nn.Linear(2, 1)  # you have 2 inputs now
X_input = torch.cat((X, X**2), dim=1)  # have 2 inputs per entry
# ...

    predictions = model(X_input)  # 2 inputs -> 1 output
    loss = loss_fn(predictions, t)
    # ...
    # learning t = c*x^2 + a*x + b
    print("learned a = {}".format(list(model.parameters())[0].data[0, 0]))
    print("learned c = {}".format(list(model.parameters())[0].data[0, 1])) 
    print("learned b = {}".format(list(model.parameters())[1].data[0])) 
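
Put together with the training loop from the question, the fragments above might look like the following. This is a minimal sketch under stated assumptions rather than the answerer's exact code: it drops the deprecated `Variable` wrapper, fixes the random seed, and raises `niter` from 50 to 500 so the parameters have time to settle. The seed value and the iteration count are my choices, not the original's.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # assumption: fixed seed for reproducibility

error = 0.1
N = 100  # number of data points

X = torch.randn(N, 1)
t = X * X + torch.randn(N, 1) * error  # noisy quadratic target

# Two features per sample: column 0 is x, column 1 is x^2.
X_input = torch.cat((X, X ** 2), dim=1)

model = nn.Linear(2, 1)  # learns t = a*x + c*x^2 + b
optimizer = optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

niter = 500  # assumption: more iterations than the question's 50
for _ in range(niter):
    optimizer.zero_grad()
    predictions = model(X_input)
    loss = loss_fn(predictions, t)
    loss.backward()
    optimizer.step()

a = model.weight[0, 0].item()  # coefficient of x
c = model.weight[0, 1].item()  # coefficient of x^2
b = model.bias[0].item()       # offset
final_loss = loss.item()

print("learned a = {}".format(a))
print("learned c = {}".format(c))
print("learned b = {}".format(b))
print("final loss = {}".format(final_loss))
```

Because the data are generated as t = x^2 plus noise, c should end up close to 1 while a and b stay close to 0, and the final loss should approach the noise variance error^2.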


Source: https://stackoverflow.com/questions/55912952/pytorch-does-not-converge-when-approximating-square-function-with-linear-model
