How to update model parameters with accumulated gradients?

孤独总比滥情好  2020-12-05 05:14

I'm using TensorFlow to build a deep learning model, and I'm new to TensorFlow.

For some reason my model is limited to a small batch size, and this limited batch size gives the gradient estimates high variance. I would like to accumulate the gradients over several mini-batches (say 64) and update the model's parameters only once with the accumulated gradients.

6 Answers
  •  长情又很酷
    2020-12-05 05:56

    I had the same problem and just figured it out.

    First get the symbolic gradients, then define the accumulated gradients as tf.Variables. (It seems that tf.global_variables_initializer() has to be run before defining grads_accum; I got errors otherwise, not sure why.)

    import tensorflow as tf  # TF 1.x graph-mode API

    # `cost` is the model's loss and `lr` the learning rate (defined elsewhere).
    tvars = tf.trainable_variables()
    optimizer = tf.train.GradientDescentOptimizer(lr)
    # symbolic gradients of the cost w.r.t. the trainable variables
    grads = tf.gradients(cost, tvars)

    # initialize (inside an open session)
    tf.local_variables_initializer().run()
    tf.global_variables_initializer().run()

    # one accumulator variable per gradient; these are fed at update time
    grads_accum = [tf.Variable(tf.zeros_like(v)) for v in grads]
    # applies whatever values the accumulators hold to the trainable variables
    update_op = optimizer.apply_gradients(zip(grads_accum, tvars))
    

    In training you accumulate the gradients of each batch (as numpy arrays, summed into a Python list gradients_accum) and update the model after running the 64th batch:

    # gradients_accum holds the summed numpy gradient values; grads_accum are
    # the accumulator tf.Variables defined above, fed here with those values.
    feed_dict = dict()
    for i, _grads in enumerate(gradients_accum):
        feed_dict[grads_accum[i]] = _grads
    sess.run(fetches=[update_op], feed_dict=feed_dict)
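
    To make the accumulation step explicit, here is a minimal sketch of the surrounding training loop under the same setup; the batch iterator next_batch() and the placeholders x and y are hypothetical names standing in for your own input pipeline:

    accumulation_steps = 64
    gradients_accum = None  # running sum of gradients as numpy arrays

    for step, (batch_x, batch_y) in enumerate(next_batch()):  # hypothetical batch iterator
        # Evaluate the symbolic gradients for this small batch.
        batch_grads = sess.run(grads, feed_dict={x: batch_x, y: batch_y})

        # Sum the gradients in Python.
        if gradients_accum is None:
            gradients_accum = batch_grads
        else:
            gradients_accum = [acc + g for acc, g in zip(gradients_accum, batch_grads)]

        # Every 64 batches, feed the accumulated values into the grads_accum
        # variables, apply them to the trainable variables, then reset.
        if (step + 1) % accumulation_steps == 0:
            feed_dict = dict()
            for i, _grads in enumerate(gradients_accum):
                feed_dict[grads_accum[i]] = _grads
            sess.run(fetches=[update_op], feed_dict=feed_dict)
            gradients_accum = None

    If you want the update to behave like one large averaged batch rather than a sum, divide each accumulated gradient by accumulation_steps before feeding it (or fold that division into the learning rate).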
    

    You can refer to tensorflow/tensorflow/python/training/optimizer_test.py for example usage, particularly this function: testGradientsAsVariables().

    Hope it helps.
