Building a custom Caffe layer in Python

Shai

You asked a lot of questions here; I'll give you some highlights and pointers that I hope will clarify matters for you. I will not explicitly answer every question.

It seems like you are most confused about the difference between a blob and a layer's input/output. Indeed, most layers have a single blob as input and a single blob as output, but that is not always the case. Consider a loss layer: it has two inputs, predictions and ground-truth labels. So, in this case, bottom is a vector of length 2(!), with bottom[0] being a (4-D) blob representing predictions, while bottom[1] is another blob with the labels. Thus, when constructing such a layer you must ascertain that you have exactly (hard-coded) 2 input blobs (see, e.g., ExactNumBottomBlobs() in the AccuracyLayer definition).

The same goes for top blobs: indeed, in most cases there is a single top per layer, but not always (see, e.g., AccuracyLayer). Therefore, top is also a vector of (4-D) blobs, one for each top of the layer. Most of the time that vector holds a single element, but sometimes you might find more than one. A minimal sketch of a Python layer enforcing these counts is shown below.
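To make the bottom/top bookkeeping concrete, here is a minimal sketch of a Python layer's setup() that enforces these counts; the class name and error messages are illustrative, not taken from any particular Caffe example:

    import caffe

    class TwoBottomLayer(caffe.Layer):
        """Illustrative layer that, like a loss layer, takes two bottoms."""

        def setup(self, bottom, top):
            # 'bottom' and 'top' are vectors of blobs; enforcing their
            # lengths here is the Python analogue of ExactNumBottomBlobs().
            if len(bottom) != 2:
                raise Exception("expecting exactly 2 bottoms: predictions and labels")
            if len(top) != 1:
                raise Exception("expecting a single top")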

I believe this covers your questions 1, 3, 4, and 6.

As for reshape() (Q.2): this function is not called on every forward pass; it is called only when the net is set up, to allocate space for inputs/outputs and parameters.
Occasionally, you might want to change the input size of your net (e.g., for detection nets); then you need to call reshape() for all layers of the net to accommodate the new input size.
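In pycaffe, that looks something like the following sketch; the blob name 'data' and the file names are placeholders for your own net:

    import caffe

    # Placeholder file names -- substitute your own net definition/weights.
    net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

    # Give the input blob (assumed here to be named 'data') a new size,
    # then let every layer re-allocate to accommodate it.
    net.blobs['data'].reshape(1, 3, 600, 800)
    net.reshape()  # calls reshape() on all layers of the net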

As for the propagate_down parameter (Q.7): since a layer may have more than one bottom, you would, in principle, need to pass the gradient to all bottoms during backprop. But what is the meaning of a gradient with respect to the label bottom of a loss layer? There are cases when you do not want to propagate to all bottoms: that is what this flag is for. (There are, for example, loss layers with three bottoms that expect gradients for all of them.) A sketch of how backward() honors this flag follows.
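Here is a sketch of a Python loss layer's backward() that skips bottoms whose propagate_down flag is False; it assumes forward() cached the element-wise difference of the two bottoms in self.diff (as in the Euclidean loss reshape() shown further below):

    def backward(self, top, propagate_down, bottom):
        # propagate_down[i] says whether bottom[i] needs a gradient;
        # for a label bottom it is typically False, so we skip it.
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num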

For more information, see this "Python" layer tutorial.

Why should it be 2?

That specific gist is talking about the Euclidean loss layer. Euclidean loss is the mean squared error between two vectors. Hence this layer takes exactly two input blobs, one for each vector. The two vectors must have the same length, because the difference is computed element-wise; you can see this check in the reshape() method.
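For reference, the reshape() of such a Euclidean loss Python layer looks roughly like this, following the pattern of Caffe's pycaffe loss-layer example:

    import numpy as np

    def reshape(self, bottom, top):
        # element-wise difference requires inputs of identical size
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        # buffer for the difference, reused by forward()/backward()
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # the loss itself is a scalar output
        top[0].reshape(1)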

Thanks.
