[Theano] How to evaluate gradient based on shared variables

Submitted by 你说的曾经没有我的故事 on 2019-12-12 02:32:54

Question


I'm currently facing this issue: I can't manage to evaluate the symbolic gradient variables of a recurrent neural network coded with Theano. Here's the code:

  W_x = theano.shared(init_W_x, name='W_x')
  W_h = theano.shared(init_W_h, name='W_h')
  W_y = theano.shared(init_W_y, name='W_y')
  [self.y, self.h], _ = theano.scan(self.step,
                                    sequences=self.x,
                                    outputs_info=[None, self.h0])

  error = ((self.y - self.t) ** 2).sum()

  gW_x, gW_y, gW_h = T.grad(self.error, [W_x, W_h, W_y])

  [...]

  def step(self, x_t, h_tm1):
      h_t = T.nnet.sigmoid(T.dot(self.W_x, x_t) + T.dot(h_tm1, self.W_h))
      y_t = T.dot(self.W_y, h_t)
      return y_t, h_t

I kept just the parts I thought were relevant.
I would like to be able to compute, for instance, gW_x, but when I try to wrap it in a theano.function it doesn't work, because its dependencies (W_x, W_h, W_y) are shared variables.

Thank you very much


Answer 1:


I believe that in this instance, you need to pass the shared variables to the function self.step in the non_sequences argument of theano.scan.

Therefore you need to change the signature of self.step to take three more arguments, corresponding to the shared variables, and then add the argument non_sequences=[W_x, W_h, W_y] to theano.scan.
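
A minimal sketch of that change, reusing the names from the question (the surrounding class and weight initialization are assumed):

  def step(self, x_t, h_tm1, W_x, W_h, W_y):
      # The shared variables now arrive as explicit arguments, in the
      # order they appear in non_sequences.
      h_t = T.nnet.sigmoid(T.dot(W_x, x_t) + T.dot(h_tm1, W_h))
      y_t = T.dot(W_y, h_t)
      return y_t, h_t

  [self.y, self.h], _ = theano.scan(self.step,
                                    sequences=self.x,
                                    outputs_info=[None, self.h0],
                                    non_sequences=[W_x, W_h, W_y])

  error = ((self.y - self.t) ** 2).sum()
  # Unpack in the same order as the wrt list.
  gW_x, gW_h, gW_y = T.grad(error, [W_x, W_h, W_y])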

Also, I suspect you may have made a typo in the gradient line: you define error, but then pass self.error to T.grad. Should it be T.grad(error, [W_x, W_h, W_y])?
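
Once the graph is set up this way, compiling the gradient is straightforward: shared variables are not listed as inputs to theano.function, since they are already part of the graph. A minimal sketch, assuming self.x and self.t are ordinary symbolic tensors:

  # W_x, W_h and W_y are implicit inputs here; only the plain symbolic
  # variables need to be declared.
  compute_gW_x = theano.function(inputs=[self.x, self.t], outputs=gW_x)

Calling compute_gW_x(x_value, t_value) then returns the gradient with respect to W_x as a NumPy array.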



Source: https://stackoverflow.com/questions/37857771/theanohow-to-evaluate-gradient-based-on-shared-variables
