Backward pass in Caffe Python Layer is not called/working?

Shai

In addition to Erik B.'s answer, you can force Caffe to backprop by specifying

force_backward: true

in your net prototxt.
See comments in caffe.proto for more information.

This is the intended behaviour: you do not have any layers "below" your Python layer that actually need the gradients to compute weight updates. Caffe notices this and skips the backward computation for such layers, because it would be a waste of time.

Caffe prints for all layers if the backward computation is needed in the log at the network initialization time. In your case, you should see something like:

fc1 does not need backward computation.

If you put an "InnerProduct" or "Convolution" layer below your "Python" layer (e.g. Data -> InnerProduct -> Python -> Loss), the backward computation becomes necessary and your backward method gets called, as in the sketch below.
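
For reference, this is the method Caffe invokes on a Python layer. A minimal identity-layer sketch (the class name and the pass-through behaviour are only illustrative, not from the question):

import caffe

class PassThroughLayer(caffe.Layer):
    """Identity layer used only to illustrate the Python layer interface."""

    def setup(self, bottom, top):
        pass  # no parameters to parse in this sketch

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        # Caffe only calls this when some layer below needs gradients
        # (or force_backward is set); propagate_down[i] says whether
        # bottom i actually needs its diff filled in.
        if propagate_down[0]:
            bottom[0].diff[...] = top[0].diff

If you never see this method hit, check the initialization log for the "does not need backward computation" message mentioned above.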

Mine wasn't working even though I did set force_backward: true as suggested by David Stutz. I found out here and here that I was forgetting to set the diff of the last layer to 1 at the index of the target class.

As Mohit Jain describes in his caffe-users answer, if you are doing ImageNet classification with the tabby cat, after doing the forward pass, you'll have to do something like:

net.blobs['prob'].diff[0][281] = 1   # 281 is tabby cat. diff shape: (1, 1000)

Note that you'll have to change 'prob' to the name of your last layer, which is usually a Softmax layer named 'prob'.
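
If you'd rather not hard-code the class index, you can also take it from the forward pass itself. A small sketch reusing the 'prob' blob name from above (assumes net is an already-loaded caffe.Net):

probs = net.blobs['prob'].data[0]      # class scores, e.g. shape (1000,)
target = int(probs.argmax())           # or any class index you care about
net.blobs['prob'].diff[...] = 0        # clear stale diffs from previous calls
net.blobs['prob'].diff[0, target] = 1
net.backward()                         # gradient w.r.t. the input lands in net.blobs['data'].diff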

Here's an example based on mine:


deploy.prototxt (it's loosely based on VGG16 just to show the structure of the file, but I didn't test it):

name: "smaller_vgg"
input: "data"
force_backward: true
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "fc1"
  top: "fc1"
}
layer {
  name: "drop1"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc2"
  top: "prob"
}

main.py:

import caffe
import cv2
import numpy as np

prototxt = 'deploy.prototxt'
model_file = 'smaller_vgg.caffemodel'
net = caffe.Net(prototxt, model_file, caffe.TRAIN)  # not sure if TEST works as well

image = cv2.imread('tabbycat.jpg', cv2.IMREAD_UNCHANGED)
image = cv2.resize(image, (224, 224))  # the net expects 224x224 input

# HWC -> CHW, then add the batch dimension to match the (1, 3, 224, 224) data blob
net.blobs['data'].data[...] = image.transpose(2, 0, 1)[np.newaxis, ...]

net.forward()
net.blobs['prob'].diff[0, 281] = 1  # 281 = tabby cat, as above
backout = net.backward()

# access the gradient from backout['data'] or net.blobs['data'].diff
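
One common use of that gradient is a vanilla-gradient saliency map. A rough sketch (the output file name is just a placeholder):

import numpy as np
import cv2

grad = net.blobs['data'].diff[0]        # shape (3, 224, 224)
saliency = np.abs(grad).max(axis=0)     # strongest channel response per pixel
saliency -= saliency.min()
saliency /= saliency.max() + 1e-12      # normalize to [0, 1]
cv2.imwrite('saliency.png', (saliency * 255).astype(np.uint8))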