Question
Backpropagation calculates dW (the weight delta) per weight per pattern, so it's straightforward how to modify the weights when doing stochastic training. How do I use it for batch training, though? Do I simply accumulate dW over the entire training set and then apply the modification, or is there more to it?
Answer 1:
Yes, just accumulate dW over the entire training set. At least that is how I coded it back in grad school...
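For illustration, here is a minimal sketch of that accumulate-then-apply loop. It is not the asker's network: a single linear unit trained with the delta rule stands in for full backpropagation, and the names `per_pattern_dw`, `train_batch_epoch`, and the `average` flag are hypothetical. The point it shows is that every dW in a pass is computed with the same, unchanged weights, and the summed dW is applied once at the end. Dividing by the number of patterns is optional; it only keeps the effective step size independent of the training set size.

```python
def per_pattern_dw(weights, x, target, lr):
    """dW for one pattern on a single linear unit (delta rule).

    Stand-in for whatever backpropagation returns per pattern.
    """
    y = sum(w * xi for w, xi in zip(weights, x))
    err = target - y
    return [lr * err * xi for xi in x]


def train_batch_epoch(weights, patterns, lr=0.1, average=True):
    """One batch epoch: accumulate dW over all patterns, then update once."""
    acc = [0.0] * len(weights)
    for x, target in patterns:
        # The weights are NOT modified inside this loop.
        dw = per_pattern_dw(weights, x, target, lr)
        acc = [a + d for a, d in zip(acc, dw)]
    if average:
        # Optional (an assumption, not from the thread): mean instead of sum.
        acc = [a / len(patterns) for a in acc]
    # Single weight modification for the whole training set.
    return [w + a for w, a in zip(weights, acc)]


# Toy data generated by target = 1*x0 + 2*x1 (hypothetical example set).
patterns = [([1.0, 0.0], 1.0), ([0.0, 1.0], 2.0), ([1.0, 1.0], 3.0)]
weights = [0.0, 0.0]
for _ in range(200):
    weights = train_batch_epoch(weights, patterns, lr=0.5)
```

With `average=False` the update is the plain sum of the per-pattern deltas, which usually calls for a smaller learning rate as the training set grows.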
Answer 2:
You can do a lot with the different gradients from the different samples. That includes higher-order information (an approximate second derivative), conjugate gradient, natural gradient, and so on. :)
Source: https://stackoverflow.com/questions/2139940/backpropagation-and-batch-training