Gradient clipping appears to choke on None

Submitted anonymously (unverified) on 2019-12-03 07:50:05

Question:

I'm trying to add gradient clipping to my graph. I used the approach recommended here: How to effectively apply gradient clipping in tensor flow?

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    if gradient_clipping:
        gradients = optimizer.compute_gradients(loss)
        clipped_gradients = [(tf.clip_by_value(grad, -1, 1), var) for grad, var in gradients]
        opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
    else:
        opt = optimizer.minimize(loss, global_step=global_step)

But when I turn on gradient clipping, I get the following stack trace:

    <ipython-input-19-be0dcc63725e> in <listcomp>(.0)
         61         if gradient_clipping:
         62             gradients = optimizer.compute_gradients(loss)
    ---> 63             clipped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients]
         64             opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
         65         else:

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/ops/clip_ops.py in clip_by_value(t, clip_value_min, clip_value_max, name)
         51   with ops.op_scope([t, clip_value_min, clip_value_max], name,
         52                    "clip_by_value") as name:
    ---> 53     t = ops.convert_to_tensor(t, name="t")
         54
         55     # Go through list of tensors, for each value in each tensor clip

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/ops.py in convert_to_tensor(value, dtype, name, as_ref)
        619     for base_type, conversion_func in funcs_at_priority:
        620       if isinstance(value, base_type):
    --> 621         ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
        622         if ret is NotImplemented:
        623           continue

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
        178                                          as_ref=False):
        179   _ = as_ref
    --> 180   return constant(v, dtype=dtype, name=name)
        181
        182

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/constant_op.py in constant(value, dtype, shape, name)
        161   tensor_value = attr_value_pb2.AttrValue()
        162   tensor_value.tensor.CopyFrom(
    --> 163       tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
        164   dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
        165   const_tensor = g.create_op(

    /home/armence/mlsandbox/venv/lib/python3.4/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape)
        344   else:
        345     if values is None:
    --> 346       raise ValueError("None values not supported.")
        347     # if dtype is provided, forces numpy array to be the type
        348     # provided if possible.

    ValueError: None values not supported.

How do I solve this problem?

Answer 1:

So, one option that seems to work is this:

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    if gradient_clipping:
        gradients = optimizer.compute_gradients(loss)

        def ClipIfNotNone(grad):
            if grad is None:
                return grad
            return tf.clip_by_value(grad, -1, 1)

        clipped_gradients = [(ClipIfNotNone(grad), var) for grad, var in gradients]
        opt = optimizer.apply_gradients(clipped_gradients, global_step=global_step)
    else:
        opt = optimizer.minimize(loss, global_step=global_step)
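
If you prefer to keep it inline, the same guard can be written as a conditional expression in the list comprehension; this is just an equivalent variant of the helper above:

    clipped_gradients = [
        (tf.clip_by_value(grad, -1, 1) if grad is not None else None, var)
        for grad, var in gradients
    ]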

It looks like compute_gradients returns None rather than a zero tensor when a variable has no dependence on the loss (i.e., its gradient would be all zeros), and tf.clip_by_value does not accept None. So the fix is simply to not pass None into tf.clip_by_value and to pass None gradients through unchanged; apply_gradients skips variables whose gradient is None.
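
For anyone wondering where the None comes from in the first place, here is a minimal sketch against the TF 1.x API used above (the variable names are purely illustrative): any variable with no path to the loss in the graph gets a None gradient instead of a zero tensor.

    import tensorflow as tf

    used = tf.Variable(1.0, name="used")
    unused = tf.Variable(1.0, name="unused")  # never appears in the loss
    loss = used * used

    optimizer = tf.train.GradientDescentOptimizer(0.1)
    # Yields [(grad_for_used, used), (None, unused)]: `unused` has no path
    # to `loss`, so its gradient is None, and passing that None into
    # tf.clip_by_value raises "None values not supported."
    grads_and_vars = optimizer.compute_gradients(loss)

As a side note, if you clip by norm rather than by value, tf.clip_by_global_norm ignores None entries in its input list (they come back as None), so norm-based clipping should compose with apply_gradients without needing a ClipIfNotNone-style helper.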


