All,
When you train a large model on a large number of samples, some samples may cause NaN gradients during the parameter update.
I want to find these samples.
You could use tf.is_nan in combination with tf.cond to only execute the rest of your code if the loss is not NaN.
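A minimal sketch of that pattern (TF1-style graph mode), assuming a scalar loss tensor named loss defined elsewhere; a slot-free optimizer is used here because optimizers that create slot variables (e.g. Adam) generally cannot build their update ops inside a cond branch in graph mode:

import tensorflow as tf

opt = tf.train.GradientDescentOptimizer(learning_rate=0.01)
grads_and_vars = [(g, v) for g, v in opt.compute_gradients(loss)
                  if g is not None]

def do_apply():
    # Ops created inside a cond branch run only when that branch is taken.
    with tf.control_dependencies([opt.apply_gradients(grads_and_vars)]):
        return tf.constant(True)

# Skip the parameter update whenever the loss is NaN; did_update reports
# whether the step was actually applied.
did_update = tf.cond(tf.logical_not(tf.is_nan(loss)), do_apply,
                     lambda: tf.constant(False))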
You can check whether your gradients contain NaN or Inf values with tf.check_numerics:
grad_check = tf.group(
    *[tf.check_numerics(g, 'clipped_gradients contains NaN or Inf')
      for g in clipped_gradients])
with tf.control_dependencies([grad_check]):
    self.optimizer = opt.apply_gradients(zip(clipped_gradients, params))
grad_check will raise a tf.errors.InvalidArgumentError if any of the clipped gradients contains NaN or infinite values.
tf.control_dependencies makes sure that grad_check is evaluated before the gradients are applied.
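In the training loop you can then catch that error to identify the offending batch; a rough sketch, where sess, inputs, labels, batch_x, and batch_y are hypothetical placeholders for your session and feed:

try:
    sess.run(self.optimizer, feed_dict={inputs: batch_x, labels: batch_y})
except tf.errors.InvalidArgumentError as err:
    # The feed that triggered the failure identifies the offending samples.
    print('NaN/Inf gradient for this batch:', err.message)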
Also see tf.add_check_numerics_ops().
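For reference, tf.add_check_numerics_ops() (TF1 graph mode) attaches a check_numerics op to every floating-point tensor in the current graph and returns a single op that runs them all; a sketch, where train_op and the feed names are again placeholders:

check_op = tf.add_check_numerics_ops()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Raises InvalidArgumentError on the first step that produces NaN/Inf
    # anywhere in the graph, not just in the gradients.
    sess.run([train_op, check_op], feed_dict={inputs: batch_x, labels: batch_y})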