Pytorch: How to create an update rule that doesn't come from derivatives?

野性不改 2020-12-18 02:29

I want to implement the following algorithm, taken from this book, section 13.6:

I don't understand how to implement the update rule in PyTorch (the rule for the weights, which does not come from the derivative of a loss).

1 Answer
  • 2020-12-18 03:01

    I am gonna give this a try.
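
    Just to fix notation (this is my reading of the usual eligibility-trace form, so double-check it against the box in the book), the value-function part of the update is

        delta <- R + gamma * v(S', w) - v(S, w)       (TD error: rewards and values only, no derivative)
        z     <- gamma * lamda * z + grad of v(S, w)  (eligibility trace)
        w     <- w + alpha * delta * z                (the update that has no loss behind it)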

    .backward() does not need a loss function; it just needs a differentiable scalar output. It computes the gradient of that scalar with respect to the model parameters. Let's just look at the first case: the update for the value function.
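
    As a tiny self-contained illustration (a toy tensor rather than your model), calling .backward() on any differentiable scalar populates .grad:

    import torch

    w = torch.randn(3, requires_grad=True)   # stands in for a model parameter
    out = (w * 2).sum()                       # any differentiable scalar, not a "loss"
    out.backward()                            # fills w.grad with d(out)/dw
    print(w.grad)                             # tensor([2., 2., 2.])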

    We have one gradient appearing for v; we can compute it (after clearing any stale gradients, e.g. with opt.zero_grad()) by

    v = model(s)    # v(S, w): a differentiable scalar from the value model
    v.backward()    # fills p.grad with the gradient of v w.r.t. each parameter p
    

    This gives us the gradient of v, stored in each parameter's .grad with the same shape as that parameter. Assuming we have already computed the other quantities in the update (the TD error delta, the trace decay, etc.), we can build the gradient the optimizer should actually apply:

    for i, p in enumerate(model.parameters()):
        # z <- gamma * lambda * z + l * (grad of v); `l` is the extra scaling factor from the pseudocode (use 1.0 if your variant has none)
        z_theta[i][:] = gamma * lamda * z_theta[i] + l * p.grad
        # the rule is w <- w + alpha * delta * z, but optimizers *subtract* .grad, hence the minus sign
        p.grad[:] = -alpha * delta * z_theta[i]
    

    We can then call opt.step() to apply this hand-built gradient and update the model parameters (with the optimizer's learning rate set to 1, since alpha is already folded into .grad). The minus sign above is needed because the algorithm's rule is an ascent step, while optimizers descend on .grad.
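
    Putting it together, here is a minimal sketch of one value-function update (the network, the constants, and the helper name value_step are mine, not from the book; plain SGD with lr=1 so that alpha is applied by hand):

    import torch
    import torch.nn as nn

    value_net = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.SGD(value_net.parameters(), lr=1.0)   # lr=1: the step is built by hand below
    gamma, lamda, alpha = 0.99, 0.9, 1e-3
    # one eligibility trace per parameter tensor, same shapes, initialised to zero
    z = [torch.zeros_like(p) for p in value_net.parameters()]

    def value_step(s, r, s_next, done):
        v = value_net(s).squeeze()                   # v(S, w): differentiable scalar
        with torch.no_grad():
            v_next = 0.0 if done else value_net(s_next).squeeze()
            delta = r + gamma * v_next - v           # TD error: no derivative involved
        opt.zero_grad()
        v.backward()                                 # p.grad = grad of v(S, w)
        with torch.no_grad():
            for zi, p in zip(z, value_net.parameters()):
                zi.mul_(gamma * lamda).add_(p.grad)  # z <- gamma*lamda*z + grad of v
                p.grad.copy_(-alpha * delta * zi)    # minus: optimizers descend, the rule ascends
        opt.step()                                   # net effect: w <- w + alpha*delta*z

    Calling value_step(torch.randn(4), 1.0, torch.randn(4), False) performs one such update; the policy parameters get the analogous treatment, with the gradient of log pi(A|S, theta) in place of the gradient of v.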
