I want to implement the following algorithm, taken from this book, section 13.6:
I don't understand how to implement the update rule in PyTorch (the rule f
I'm going to give this a try.
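For reference, here is the update rule from that box as I remember it (the continuing, average-reward case of section 13.6 — treat this as a paraphrase, not a verbatim copy):

$$
\begin{aligned}
\delta &\leftarrow R - \bar R + \hat v(S', \mathbf w) - \hat v(S, \mathbf w) \\
\bar R &\leftarrow \bar R + \alpha^{\bar R} \delta \\
\mathbf z^{\mathbf w} &\leftarrow \lambda^{\mathbf w} \mathbf z^{\mathbf w} + \nabla \hat v(S, \mathbf w) \\
\mathbf z^{\boldsymbol\theta} &\leftarrow \lambda^{\boldsymbol\theta} \mathbf z^{\boldsymbol\theta} + \nabla \ln \pi(A \mid S, \boldsymbol\theta) \\
\mathbf w &\leftarrow \mathbf w + \alpha^{\mathbf w} \delta\, \mathbf z^{\mathbf w} \\
\boldsymbol\theta &\leftarrow \boldsymbol\theta + \alpha^{\boldsymbol\theta} \delta\, \mathbf z^{\boldsymbol\theta}
\end{aligned}
$$

If you are working from the episodic variant in the previous section instead, the only differences (if I recall correctly) are that δ bootstraps with γ·v̂(S',w), the traces decay with γλ, and the policy trace carries an extra I = γ^t factor.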
`.backward()` does not need a loss function; it just needs a differentiable scalar output, and it computes the gradient of that scalar with respect to the model parameters. Let's look at the first case: the update for the value function.
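To make that concrete, here is a minimal self-contained sketch; the tiny value network and the state tensor are placeholders I made up, not anything from the book:

```python
import torch
import torch.nn as nn

# Stand-in for the value function v̂(s, w): any module with a scalar output works.
model = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 1))

s = torch.randn(4)       # a made-up state observation
v = model(s).squeeze()   # a scalar output that is not a "loss" in any meaningful sense
v.backward()             # autograd differentiates v with respect to the parameters

for name, p in model.named_parameters():
    print(name, p.grad.shape)   # each gradient has the same shape as its parameter
```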
We have one gradient appearing for v; we can compute this gradient with:
```python
v = model(s)   # v̂(S, w): the scalar value estimate for the current state
v.backward()   # fills p.grad with ∂v/∂p for every model parameter p
```
This fills `p.grad` for every parameter `p` with the gradient of `v`, so each gradient tensor has the same shape as the corresponding parameter. Assuming the remaining quantities already exist (the trace tensors `z_theta`, initialized to zeros with the parameter shapes, and the TD error `delta` — a sketch for computing `delta` follows after the loop), we can turn the trace update into the gradient the optimizer should apply:
```python
for i, p in enumerate(model.parameters()):
    # z ← γλ·z + l·∇v̂  (l is presumably the I = γ^t factor; it is absent for the value trace)
    z_theta[i][:] = gamma * lamda * z_theta[i] + l * p.grad
    # minus sign because opt.step() will *subtract* the gradient (p ← p − lr·grad)
    p.grad[:] = -alpha * delta * z_theta[i]
```
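The quantity the loop takes for granted is the TD error `delta`. A minimal sketch of computing it for the continuing case, with made-up names `s_next`, `r`, `r_bar`, and `alpha_rbar` (none of these come from the book's pseudocode verbatim):

```python
with torch.no_grad():                         # the bootstrap target must not propagate gradients
    v_next = model(s_next).squeeze()
    delta = (r - r_bar + v_next - v).item()   # δ = R − R̄ + v̂(S',w) − v̂(S,w)
r_bar += alpha_rbar * delta                   # update the average-reward estimate R̄
# episodic variant instead: delta = (r + gamma * v_next - v).item()
```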
We can then call `opt.step()` to apply the adjusted gradient; with a plain `torch.optim.SGD(model.parameters(), lr=1.0)` this realizes the book's update w ← w + αδz (or fold `alpha` into `lr` and drop it from the loop, so the step size is not applied twice). Also remember `opt.zero_grad()` before the next backward pass, since `.backward()` accumulates into `p.grad`.
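Putting the pieces together, one possible shape for a single critic update (a sketch under the assumptions above — it reuses the placeholder `model`, uses the continuing-case trace decay `lam` rather than `gamma * lamda`, and is not the book's pseudocode line by line):

```python
import torch

z = [torch.zeros_like(p) for p in model.parameters()]   # one eligibility trace per parameter
opt = torch.optim.SGD(model.parameters(), lr=1.0)       # lr=1: alpha is applied inside the loop

def critic_step(s, s_next, r, r_bar, alpha, alpha_rbar, lam):
    """One value-function update; the actor update is analogous with log π(A|S, θ)."""
    opt.zero_grad()
    v = model(s).squeeze()
    v.backward()                                         # p.grad = ∇v̂(S, w)

    with torch.no_grad():
        v_next = model(s_next).squeeze()
        delta = (r - r_bar + v_next - v).item()          # continuing-case TD error
        r_bar += alpha_rbar * delta

        for zi, p in zip(z, model.parameters()):
            zi.mul_(lam).add_(p.grad)                    # z ← λ z + ∇v̂(S, w)
            p.grad.copy_(-alpha * delta * zi)            # minus: SGD subtracts p.grad

    opt.step()                                           # net effect: w ← w + α δ z
    return delta, r_bar
```

For the episodic version, decay the traces with `gamma * lam`, bootstrap `delta` with `gamma * v_next`, and include the I = γ^t factor in the policy trace.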