I am a little confused about how I should use/insert a "BatchNorm" layer in my models.
I see several different approaches, for instance:
ResNets: "BatchNorm" + "Scale" (no parameter sharing)
The "BatchNorm" layer is followed immediately by a "Scale" layer:
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "bn2a_branch1"
  type: "BatchNorm"
  batch_norm_param {
    use_global_stats: true
  }
}
layer {
  bottom: "res2a_branch1"
  top: "res2a_branch1"
  name: "scale2a_branch1"
  type: "Scale"
  scale_param {
    bias_term: true
  }
}
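To make clear what I think this pair computes, here is a minimal pure-Python sketch for a single channel. It is not Caffe's actual implementation: the function names `batch_norm` and `scale`, the `eps` value, and the sample numbers are my own assumptions for illustration. As far as I understand, "BatchNorm" only normalizes, and "Scale" with `bias_term: true` adds the learnable multiplier and offset.

```python
import math

def batch_norm(xs, mean, var, eps=1e-5):
    """Normalize one channel with the given statistics (no learnable parameters).

    eps is an assumed small constant that guards against division by zero.
    """
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def scale(xs, gamma, beta):
    """Per-channel affine transform, in the spirit of "Scale" with bias_term: true."""
    return [gamma * x + beta for x in xs]

# Hypothetical activations of one channel across a mini-batch.
channel = [1.0, 2.0, 3.0, 4.0]
mean = sum(channel) / len(channel)
var = sum((x - mean) ** 2 for x in channel) / len(channel)

normalized = batch_norm(channel, mean, var)      # roughly zero mean, unit variance
output = scale(normalized, gamma=2.0, beta=0.5)  # gamma and beta would be learned
```

If this reading is right, dropping the "Scale" layer leaves the output fixed at zero mean and unit variance, with nothing learnable to undo the normalization.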
cifar10 example: only "BatchNorm"
In the cifar10 example provided with caffe, "BatchNorm" is used without any "Scale" following it:
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}
cifar10: different batch_norm_param for TRAIN and TEST
batch_norm_param: use_global_stats is changed between the TRAIN and TEST phases:
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: false
  }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  include { phase: TRAIN }
}
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    use_global_stats: true
  }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  include { phase: TEST }
}
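My current understanding of the `use_global_stats` switch, written as a simplified pure-Python sketch for one channel. The class `BatchNormSketch`, its `momentum` parameter, and the plain exponential moving average are assumptions made for illustration; Caffe's own bookkeeping of the running statistics may differ in detail.

```python
import math

class BatchNormSketch:
    """Toy single-channel batch normalization with running statistics."""

    def __init__(self, momentum=0.999, eps=1e-5):
        self.running_mean = 0.0
        self.running_var = 1.0
        self.momentum = momentum  # assumed weight given to the old running value
        self.eps = eps

    def forward(self, xs, use_global_stats):
        if use_global_stats:
            # TEST-like behaviour: reuse the stored statistics, change nothing.
            mean, var = self.running_mean, self.running_var
        else:
            # TRAIN-like behaviour: use the statistics of the current batch
            # and fold them into the running averages.
            mean = sum(xs) / len(xs)
            var = sum((x - mean) ** 2 for x in xs) / len(xs)
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        return [(x - mean) / math.sqrt(var + self.eps) for x in xs]

bn = BatchNormSketch(momentum=0.5)
batch = [1.0, 2.0, 3.0, 4.0]
train_out = bn.forward(batch, use_global_stats=False)  # updates running stats
test_out = bn.forward(batch, use_global_stats=True)    # reads running stats only
```

In this sketch the same input gives different outputs in the two modes until the running statistics have converged, which is presumably why the example defines one layer per phase.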
So what should it be?
How should one use the "BatchNorm" layer in caffe?