What is the `weight_decay` meta parameter in Caffe?
Looking at an example `solver.prototxt`, posted on the BVLC/caffe git, there is a training meta parameter `weight_decay: 0.04`. What does this meta parameter mean? And what value should I assign to it?

**Shai**

The `weight_decay` meta parameter governs the regularization term of the neural net.

During training, a regularization term is added to the network's loss to compute the backprop gradient. The `weight_decay` value determines how dominant this regularization term will be in the gradient computation.

As a rule of thumb, the more training examples you have, the weaker this term should be. The more parameters your model has, the stronger this term should be.
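Concretely, the objective that Caffe's solver minimizes can be sketched as follows, with $W$ the weights, $D$ the training set, and $\lambda$ playing the role of `weight_decay`:

```latex
% Data loss averaged over the training set, plus a weighted regularization term.
% \lambda corresponds to weight_decay; r(W) is the regularizer (L2 by default,
% selectable via regularization_type).
E(W) = \frac{1}{|D|} \sum_{i=1}^{|D|} f_W\!\left(X^{(i)}\right) + \lambda\, r(W)
```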
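For illustration, here is a minimal `solver.prototxt` sketch showing where `weight_decay` sits among the other solver fields. The net path and the numeric values are placeholders, not recommendations:

```
# Minimal solver sketch (values are illustrative placeholders)
net: "train_val.prototxt"        # hypothetical network definition file
base_lr: 0.01                    # base learning rate
momentum: 0.9
weight_decay: 0.04               # strength of the regularization term
# regularization_type: "L2"      # default; can be set to "L1" instead
lr_policy: "step"
stepsize: 10000
gamma: 0.1
max_iter: 100000
snapshot_prefix: "snapshots/model"
solver_mode: GPU
```

Note that individual layers can scale this global value per parameter blob via `decay_mult` in their `param { ... }` blocks (e.g. `decay_mult: 0` is commonly used to exclude biases from regularization).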