Question
I am creating a network using Caffe, for which I need to define my own layer. I would like to use the Python
layer for this.
My layer will contain some learned parameters. From this answer, I am told that I will need to create a blob vector for this.
- Is there any specification that this blob will need to follow, such as constraints on dimensions? Irrespective of what my layer does, can I create a one-dimensional blob and use its elements, one each, for any computation in the layer?
- What does the diff of a blob mean? From what I understand, the diff of bottom is the gradient at the current layer, and that of top is for the previous layer. However, what exactly is happening here?
- When do these parameters get trained? Does this need to be done manually in the layer definition?
I have seen the examples in test_python_layer.py, but most of them do not have any parameters.
Answer 1:
You can add as many internal parameters as you wish, and these parameters (Blobs) may have whatever shape you want them to be.
To add Blobs (in your layer's class):
def setup(self, bottom, top):
    # each call to add_blob appends one parameter blob of the given shape
    self.blobs.add_blob(3, 4)    # first blob is 2D (3x4)
    self.blobs[0].data[...] = 0  # init to 0
    self.blobs.add_blob(10)      # second blob is 1D with 10 elements
    self.blobs[1].data[...] = 1  # init to 1
What is the "meaning" of each parameter and how to organize them in self.blobs
is entirely up to you.
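For example, one possible convention (purely illustrative, not part of the Caffe API) is to treat the first blob as a weight matrix and the second as a bias, and use them in forward. In plain NumPy, with arrays standing in for the blobs' .data fields, that might look like:

```python
import numpy as np

# Hypothetical convention: blobs[0].data is a 3x4 weight matrix W,
# blobs[1].data is a length-3 bias b, and forward computes top = W @ x + b.
W = np.zeros((3, 4))     # stands in for self.blobs[0].data, initialized to 0
b = np.ones(3)           # stands in for self.blobs[1].data, initialized to 1
x = np.arange(4.0)       # stands in for bottom[0].data (one input vector)

top = W @ x + b          # what forward() would write into top[0].data
print(top)               # W is all zeros and b all ones, so this prints [1. 1. 1.]
```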
How are trainable parameters being "trained"?
This is one of the cool things about caffe (and other DNN toolkits as well), you don't need to worry about it!
What do you need to do? All you need is to compute the gradient of the loss w.r.t. the parameters and store it in self.blobs[i].diff. Once those diffs are set, caffe's internals take care of updating the parameters according to the gradients, learning rate, momentum, update policy, etc.
So, you must have a non-trivial backward method for your layer:
def backward(self, top, propagate_down, bottom):
    self.blobs[0].diff[...] = ...  # gradient of the loss w.r.t. the first parameter blob
    self.blobs[1].diff[...] = ...  # likewise for every other parameter blob
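As a concrete sketch of what goes into those diff arrays, consider a hypothetical affine layer computing top = W @ x + b. By the chain rule, backward would fill the parameter diffs and the bottom diff like this (plain NumPy standing in for the .diff fields; the shapes are invented for illustration):

```python
import numpy as np

# Hypothetical affine layer: top = W @ x + b, with W of shape (3, 4).
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))    # stands in for self.blobs[0].data
x = rng.standard_normal(4)         # stands in for bottom[0].data
top_diff = rng.standard_normal(3)  # dL/d(top), handed down by the layer above

# What backward() would store:
W_diff = np.outer(top_diff, x)     # dL/dW -> self.blobs[0].diff
b_diff = top_diff.copy()           # dL/db -> self.blobs[1].diff
bottom_diff = W.T @ top_diff       # dL/dx -> bottom[0].diff

# Each diff has the same shape as the blob it belongs to.
assert W_diff.shape == W.shape and bottom_diff.shape == x.shape
```

Note that the layer only writes the diffs; it never touches .data itself — the solver reads the diffs and applies the configured update rule.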
You might want to test your implementation of the layer, once you complete it.
Have a look at this PR for a numerical test of the gradients.
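The gist of such a numerical test, reduced to a single scalar parameter: perturb the parameter, re-evaluate the loss, and check that the finite-difference slope matches the analytic gradient your backward would store. A toy sketch (the loss and the numbers are made up for illustration):

```python
# Toy gradient check: L(w) = 0.5 * (w * x - t)**2 for made-up scalars x, t.
x, t, w = 2.0, 5.0, 1.5

def loss(w):
    return 0.5 * (w * x - t) ** 2

analytic = (w * x - t) * x  # chain rule: what backward would store in .diff

eps = 1e-6
numeric = (loss(w + eps) - loss(w - eps)) / (2 * eps)  # central finite difference

assert abs(analytic - numeric) < 1e-4  # the two estimates agree
```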
Source: https://stackoverflow.com/questions/44418828/how-should-i-use-blobs-in-a-caffe-python-layer-and-when-does-their-training-tak