Efficiently grab gradients from TensorFlow?

小鲜肉 · 2020-12-24 03:20

I'm trying to implement an asynchronous parameter server, DistBelief style, using TensorFlow. I found that minimize() is split into two functions, compute_gradients and apply_gradients.
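
For reference, minimize(loss) is literally the composition of those two calls, so gradients can be intercepted between them. A minimal sketch of the split (the variable and loss here are stand-ins for illustration):

    import tensorflow as tf

    x = tf.Variable(1.0)
    loss = x * x
    opt = tf.train.GradientDescentOptimizer(0.1)
    # minimize(loss) is equivalent to these two steps:
    grads_and_vars = opt.compute_gradients(loss)    # list of (gradient, variable) pairs
    train_op = opt.apply_gradients(grads_and_vars)  # apply them, possibly after modification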

2 Answers
  •  爱一瞬间的悲伤 · 2020-12-24 04:10

    Here is a simple example. Work through it, then adapt it to your specific task.

    Build the required symbols.

    import tensorflow as tf

    x = tf.Variable(0.5)
    y = x * x
    opt = tf.train.AdagradOptimizer(0.1)
    # compute_gradients returns a list of (gradient, variable) pairs
    grads = opt.compute_gradients(y)
    # One placeholder per gradient, so externally modified values can be fed back in
    grad_placeholder = [(tf.placeholder("float", shape=grad[1].get_shape()), grad[1]) for grad in grads]
    apply_placeholder_op = opt.apply_gradients(grad_placeholder)
    # Alternatively, transform the gradients in-graph (function1 is user-supplied; see the note below)
    transform_grads = [(function1(grad[0]), grad[1]) for grad in grads]
    apply_transform_op = opt.apply_gradients(transform_grads)
    

    Initialize the session and variables.

    sess = tf.Session()
    sess.run(tf.global_variables_initializer())  # initialize_all_variables() is deprecated
    

    Get all gradients

    grad_vals = sess.run([grad[0] for grad in grads])  # fetches the gradients as numpy arrays
    

    Apply gradients

    feed_dict = {}
    for i in range(len(grad_placeholder)):  # range rather than Python-2-only xrange
        feed_dict[grad_placeholder[i][0]] = function2(grad_vals[i])
    sess.run(apply_placeholder_op, feed_dict=feed_dict)
    sess.run(apply_transform_op)
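
    In the asynchronous parameter-server setting the question describes, the fetch and the feed would typically run in different processes. A rough sketch under that assumption (send_to_server is a hypothetical transport, e.g. a queue or RPC, not shown here):

    # Hypothetical worker side: compute gradient values locally and ship them off
    def worker_step(sess, grads, send_to_server):
        grad_vals = sess.run([g for g, _ in grads])
        send_to_server(grad_vals)

    # Hypothetical server side: feed received values into the placeholders
    def server_step(sess, grad_placeholder, apply_placeholder_op, grad_vals):
        feed_dict = {ph: val for (ph, _), val in zip(grad_placeholder, grad_vals)}
        sess.run(apply_placeholder_op, feed_dict=feed_dict)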
    

    Note: I haven't run this code myself, so there may be minor slips, but the overall structure is valid. function1 and function2 stand for arbitrary computations on the gradients, such as 2*x, x^e, or e^x.
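
    As a concrete illustration (the bodies below are arbitrary examples, not anything the answer prescribes), function1 must be an in-graph op because it runs at graph-construction time, while function2 is a plain Python function applied to the fetched numpy arrays:

    def function1(grad):
        # in-graph transformation: takes a gradient tensor, returns a tensor
        return 2.0 * grad

    def function2(grad_val):
        # numpy-side transformation applied to a fetched gradient value
        return 0.5 * grad_val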

    See also: TensorFlow apply_gradients remotely
