What is the difference between backpropagation and reverse-mode autodiff?

无人共我  2021-02-13 23:20

Going through this book, I am familiar with the following:

For each training instance the backpropagation algorithm first makes a prediction (forward pass) […]

3 Answers

孤城傲影  2021-02-14 00:04

    The most important distinction between backpropagation and reverse-mode AD is that reverse-mode AD computes the vector-Jacobian product of a vector-valued function f: R^n -> R^m, while backpropagation computes the gradient of a scalar-valued function f: R^n -> R. Backpropagation is therefore a special case of reverse-mode AD.
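
    To make the distinction concrete, here is a minimal sketch in JAX (the functions `f` and `g` are illustrative, not from the original question): `jax.vjp` exposes the vector-Jacobian product of a vector-valued function, while `jax.grad` gives the gradient of a scalar-valued one.

    ```python
    import jax
    import jax.numpy as jnp

    # f: R^3 -> R^2, a vector-valued function.
    def f(x):
        return jnp.stack([jnp.sum(x ** 2), jnp.prod(x)])

    # g: R^3 -> R, a scalar-valued function (the first component of f).
    def g(x):
        return jnp.sum(x ** 2)

    x = jnp.array([1.0, 2.0, 3.0])

    # Reverse-mode AD: one reverse pass gives the vector-Jacobian
    # product v^T J for any cotangent vector v in R^2.
    _, f_vjp = jax.vjp(f, x)
    v = jnp.array([1.0, 0.0])   # selects the first row of J
    (vjp_out,) = f_vjp(v)

    # Backpropagation: the gradient of the scalar-valued g.
    grad_g = jax.grad(g)(x)

    print(vjp_out, grad_g)      # both print [2. 4. 6.]
    ```

    With v = [1.0, 0.0], the VJP of f picks out the gradient of its first component, which is exactly g's gradient.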

    When we train neural networks, the loss function is always scalar-valued, so we are always using backpropagation. And since backpropagation is a special case of reverse-mode AD, we are also using reverse-mode AD whenever we train a neural network.
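
    As a sketch of that (the tiny one-layer model, its shapes, and the mean-squared-error loss below are hypothetical, chosen only for illustration): the training loss is scalar-valued, and `jax.grad` runs reverse-mode AD seeded with the scalar cotangent 1.0 on it, which is the backpropagation step.

    ```python
    import jax
    import jax.numpy as jnp

    # Hypothetical one-layer model with a scalar MSE loss.
    def loss(params, inputs, targets):
        w, b = params
        preds = jnp.tanh(inputs @ w + b)         # forward pass
        return jnp.mean((preds - targets) ** 2)  # scalar output: R^n -> R

    key = jax.random.PRNGKey(0)
    w = jax.random.normal(key, (4, 2))
    b = jnp.zeros(2)
    inputs = jax.random.normal(key, (8, 4))
    targets = jnp.ones((8, 2))

    # Reverse-mode AD on a scalar loss, i.e. backpropagation.
    grad_w, grad_b = jax.grad(loss)((w, b), inputs, targets)
    ```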

    Whether "backpropagation" means the more general notion of reverse-mode AD applied to any scalar loss function, or the narrower notion of reverse-mode AD applied to a scalar loss function for training neural networks, is a matter of taste. The word has slightly different meanings in different contexts, but in the machine learning community it most commonly refers to computing the gradients of neural network parameters with respect to a scalar loss.

    For completeness: sometimes reverse-mode AD can compute the full Jacobian in a single reverse pass, not just a vector-Jacobian product. Also, for a scalar-valued function the Jacobian is a 1×n row vector, so its vector-Jacobian product with the vector [1.0] is exactly the gradient.
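
    Both points can be sketched in JAX as well (`f`, `g`, and the shapes are again illustrative assumptions): `jax.jacrev` builds the full Jacobian from reverse-mode VJPs, and seeding the VJP of a scalar function with the cotangent 1.0 reproduces `jax.grad`.

    ```python
    import jax
    import jax.numpy as jnp

    def f(x):   # R^3 -> R^2, as in the first sketch
        return jnp.stack([jnp.sum(x ** 2), jnp.prod(x)])

    def g(x):   # R^3 -> R, scalar-valued
        return jnp.sum(x ** 2)

    x = jnp.array([1.0, 2.0, 3.0])

    # Full 2x3 Jacobian via reverse mode; jacrev evaluates one VJP
    # per output row (batched internally).
    J = jax.jacrev(f)(x)

    # For a scalar function the Jacobian is a 1xn row, so the VJP
    # with cotangent 1.0 is exactly the gradient.
    _, g_vjp = jax.vjp(g, x)
    (grad_via_vjp,) = g_vjp(jnp.ones(()))
    assert jnp.allclose(grad_via_vjp, jax.grad(g)(x))
    ```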
