TensorBoard: How to plot histogram for gradients?


Question


TensorBoard had the function to plot histograms of Tensors at session-time. I want a histogram for the gradients during training.

tf.gradients(yvars, xvars) returns a list of gradients.

However, tf.histogram_summary('name',Tensor) accepts only Tensors, not lists of Tensors.

For the time being, I made a workaround: I flatten each Tensor to a column vector and concatenate them all:

for l in xrange(listlength):
    col_vec = tf.reshape(grads[l], [-1, 1])
    g = tf.concat(0, [g, col_vec])
grad_hist = tf.histogram_summary("name", g)

What would be a better way to plot the histogram for the gradient?

This seems like a common thing to do, so I hope TensorFlow has a dedicated function for it.


Answer 1:


Following the suggestion from @user728291, I was able to view gradients in TensorBoard by using the optimize_loss function as follows. The calling syntax for optimize_loss is

optimize_loss(
loss,
global_step,
learning_rate,
optimizer,
gradient_noise_scale=None,
gradient_multipliers=None,
clip_gradients=None,
learning_rate_decay_fn=None,
update_ops=None,
variables=None,
name=None,
summaries=None,
colocate_gradients_with_ops=False,
increment_global_step=True
)

The function requires a global_step and depends on some other imports, as shown next.

from tensorflow.python.ops import variable_scope
from tensorflow.python.framework import dtypes
from tensorflow.python.ops import init_ops
global_step = variable_scope.get_variable(  # this needs to be defined for tf.contrib.layers.optimize_loss()
      "global_step", [],
      trainable=False,
      dtype=dtypes.int64,
      initializer=init_ops.constant_initializer(0, dtype=dtypes.int64))
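
As a side note, if you would rather not import from the internal tensorflow.python modules, the same global step can be obtained through the public API. A minimal sketch, assuming a TF 1.x version in which tf.train.get_or_create_global_step is available:

import tensorflow as tf

# Creates the global_step variable (or reuses it if it already exists);
# the result can be passed to tf.contrib.layers.optimize_loss() as before.
global_step = tf.train.get_or_create_global_step()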

Then replace your typical training operation

training_operation = optimizer.minimize(loss_operation)

with

training_operation = tf.contrib.layers.optimize_loss(
      loss_operation, global_step, learning_rate=rate, optimizer='Adam',
      summaries=["gradients"])

Then add a merge statement for your summaries:

summary = tf.summary.merge_all()

Then, in your TensorFlow session, at the end of each run/epoch:

summary_writer = tf.summary.FileWriter(logdir_run_x, sess.graph) 
summary_str = sess.run(summary, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, i)
summary_writer.flush()  # evidently this is needed sometimes or scalars will not show up on tensorboard.

Here logdir_run_x is a different directory for each run, so when TensorBoard runs you can look at each run separately. The gradients will be under the histogram tab with the label OptimizeLoss, and will show all the weights, all the biases, and the beta parameter as histograms.

UPDATE: Using TF-Slim, there is another way that also works and is perhaps cleaner.

optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = slim.learning.create_train_op(loss_operation, optimizer, summarize_gradients=True)

By setting summarize_gradients=True, which is not the default, you will get gradient summaries for all weights. These are viewable in TensorBoard under summarize_grads.
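
Since summarize_gradients=True adds the gradient histograms to the default summaries collection, you can write them out with the same merge/FileWriter pattern shown above. A minimal sketch, assuming the training_operation from the snippet above and placeholder names such as feed_dict, logdir_run_x, and num_epochs:

summary = tf.summary.merge_all()  # also picks up the summarize_grads histograms

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    summary_writer = tf.summary.FileWriter(logdir_run_x, sess.graph)
    for i in range(num_epochs):
        _, summary_str = sess.run([training_operation, summary], feed_dict=feed_dict)
        summary_writer.add_summary(summary_str, i)
    summary_writer.flush()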




Answer 2:


Another solution (based on this Quora answer) is to access the gradients directly from the optimizer you are already using.

optimizer = tf.train.AdamOptimizer(..)
# compute_gradients returns a list of (gradient, variable) pairs
grads = optimizer.compute_gradients(loss)
# one histogram summary per variable's gradient, merged into a single op
grad_summ_op = tf.summary.merge([tf.summary.histogram("%s-grad" % g[1].name, g[0]) for g in grads])
grad_vals = sess.run(fetches=grad_summ_op, feed_dict=feed_dict)
writer['train'].add_summary(grad_vals)
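
Note that compute_gradients only builds the gradient tensors; the variables are not updated until apply_gradients is called as well. A fuller sketch of this approach, assuming TF 1.x and hypothetical placeholder names such as logdir and num_steps:

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
grads_and_vars = optimizer.compute_gradients(loss)

# One histogram per (gradient, variable) pair; ':' in variable names is not a
# legal summary-name character, so replace it to avoid the sanitization warning.
grad_summ_op = tf.summary.merge(
    [tf.summary.histogram(v.name.replace(":", "_") + "-grad", g)
     for g, v in grads_and_vars if g is not None])

train_op = optimizer.apply_gradients(grads_and_vars)  # actually updates the weights

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter(logdir, sess.graph)
    for step in range(num_steps):
        _, grad_vals = sess.run([train_op, grad_summ_op], feed_dict=feed_dict)
        writer.add_summary(grad_vals, step)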


Source: https://stackoverflow.com/questions/36392952/tensorboard-how-to-plot-histogram-for-gradients
