How to code Adagrad in Python with Theano

Submitted by 一笑奈何 on 2019-12-06 10:02:15

Perhaps you can use the following example implementation of Adadelta as a starting point and derive your own Adagrad from it. Please post an update if you succeed :-)

I was looking for the same thing and ended up implementing it myself in the style of the resource zuuz already pointed out. Maybe this helps anyone else who lands here.

import numpy as np
import theano
import theano.tensor as T


def adagrad(lr, tparams, grads, inp, cost):
    # lr is a symbolic scalar (e.g. T.scalar('lr')); tparams maps names to shared variables
    # shared variables that hold the most recent gradients
    gshared = [theano.shared(np.zeros_like(p.get_value(),
                                           dtype=theano.config.floatX),
                             name='%s_grad' % k)
               for k, p in tparams.items()]
    # list(...) so the updates can be concatenated below on Python 3
    grads_updates = list(zip(gshared, grads))
    # shared variables that accumulate the sum of all squared gradients
    hist_gshared = [theano.shared(np.zeros_like(p.get_value(),
                                                dtype=theano.config.floatX),
                                  name='%s_hist_grad' % k)
                    for k, p in tparams.items()]
    rgrads_updates = [(rg, rg + T.sqr(g)) for rg, g in zip(hist_gshared, grads)]

    # compute the cost, store the gradients and update the accumulators
    f_grad_shared = theano.function(inp, cost,
                                    updates=grads_updates + rgrads_updates,
                                    on_unused_input='ignore')

    # apply the actual update with the initial learning rate lr;
    # n keeps the denominator away from zero
    n = 1e-6
    updates = [(p, p - (lr/(T.sqrt(rg) + n))*g)
               for p, g, rg in zip(tparams.values(), gshared, hist_gshared)]

    f_update = theano.function([lr], [], updates=updates, on_unused_input='ignore')

    return f_grad_shared, f_update
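
Written out, this is the per-parameter rule the function above applies, as far as I can read it from the code. The symbols are my own shorthand, not names from the answer: G stands for the accumulator in hist_gshared, g for the gradient, \eta for lr and \epsilon for n. All operations are element-wise, and it assumes f_grad_shared is called before f_update on every step.

```latex
G_t = G_{t-1} + g_t^{2}, \qquad
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{G_t} + \epsilon}\, g_t
```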

I find this implementation from Lasagne very concise and readable. You can use it pretty much as is:

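from collections import OrderedDict
import numpy as np
import theano
import theano.tensor as T

# params, grads, learning_rate and epsilon are assumed to be defined by the caller
updates = OrderedDict()
for param, grad in zip(params, grads):
    value = param.get_value(borrow=True)
    # accumulator for the sum of squared gradients, same shape as the parameter
    accu = theano.shared(np.zeros(value.shape, dtype=value.dtype),
                         broadcastable=param.broadcastable)
    accu_new = accu + grad ** 2
    updates[accu] = accu_new
    updates[param] = param - (learning_rate * grad /
                              T.sqrt(accu_new + epsilon))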
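
One small difference between the two snippets on this page is where the stabilising constant goes: the first adds it after the square root, the Lasagne one adds it inside. A plain NumPy sketch can be handy for checking a Theano version against known numbers. The function name adagrad_step, the eps_inside_sqrt flag and the toy quadratic objective below are my own assumptions for illustration, not part of either answer.

```python
import numpy as np


def adagrad_step(param, grad, accu, learning_rate=0.1, epsilon=1e-6,
                 eps_inside_sqrt=True):
    """One Adagrad step in plain NumPy; returns (new_param, new_accu)."""
    accu_new = accu + grad ** 2  # running sum of squared gradients
    if eps_inside_sqrt:
        # Lasagne-style: epsilon is added before taking the square root
        denom = np.sqrt(accu_new + epsilon)
    else:
        # first snippet's style: epsilon is added after the square root
        denom = np.sqrt(accu_new) + epsilon
    return param - learning_rate * grad / denom, accu_new


# Toy problem (an assumption for illustration): minimise 0.5 * ||w||^2,
# whose gradient is simply w.
w = np.array([1.0, -2.0])
accu = np.zeros_like(w)
for _ in range(100):
    w, accu = adagrad_step(w, w, accu)
```

For gradients that are not tiny, the two placements should give nearly identical steps, so either should be fine as a reference when comparing against the shared variables after a call to the compiled Theano function.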