Tensorflow: Compute Hessian matrix (only diagonal part) with respect to a high rank tensor

我的未来我决定 提交于 2019-12-23 05:16:03

问题


I would like to compute the first and the second derivatives(diagonal part of Hessian) of my specified Loss with respect to each feature map of a vgg16 conv4_3 layer's kernel which is a 3x3x512x512 dimensional matrix. I know how to compute derivatives if it is respected to a low-rank one according to How to compute all second derivatives (only the diagonal of the Hessian matrix) in Tensorflow? However, when it turns to higher-rank, I got completed lost.

# Inspecting variables under Ipython notebook
In  : Loss 
Out : <tf.Tensor 'local/total_losses:0' shape=() dtype=float32>

In  : conv4_3_kernel.get_shape() 
Out : TensorShape([Dimension(3), Dimension(3), Dimension(512), Dimension(512)])

## Compute derivatives
Grad = tf.compute_gradients(Loss, conv4_3_kernel)
Hessian = tf.compute_gradients(Grad, conv4_3_kernel)

In  : Grad 
Out : [<tf.Tensor 'gradients/vgg/conv4_3/Conv2D_grad/Conv2DBackpropFilter:0' shape=(3, 3, 512, 512) dtype=float32>]

In  : Hessian 
Out : [<tf.Tensor 'gradients_2/vgg/conv4_3/Conv2D_grad/Conv2DBackpropFilter:0' shape=(3, 3, 512, 512) dtype=float32>]

Please help me to check my understandings. So, for conv4_3_kernel, each dim stand for [Kx, Ky, in_channels, out_channels], so Grad should be partial derivatives of Loss with respect to each element(pixel) in the each feature maps. And Hessian is the second derivatives.

But, Hessian computes all the derivatives, how can I only compute only the diagonal part? should I use tf.diag_part()? Many thanks in advance!


回答1:


tf.compute_gradients computes derivative of a scalar quantity. If the quantity provided isn't scalar, it turns it into scalar by summing up the components which is what's happening in your example

To compute full Hessian you need n calls to tf.gradients, The example is here. If you want just the diagonal part, then modify arguments to ith call to tf.gradients to differentiate with respect to ith variable, rather than all variables.



来源:https://stackoverflow.com/questions/40101165/tensorflow-compute-hessian-matrix-only-diagonal-part-with-respect-to-a-high-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!