I'm a bit confused by the cross-entropy loss in PyTorch.
Consider this example:

import torch
import torch.nn as nn
I would like to add an important note, as this often leads to confusion.
Softmax is not a loss function, nor is it really an activation function. It has a very specific task: it is used in multi-class classification to normalize the scores for the given classes. By doing so, we get probabilities for each class that sum to 1.
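
For instance, a minimal sketch (the scores here are made-up values):

import torch
import torch.nn.functional as F

# Raw, unnormalized scores (logits) for 3 classes
scores = torch.tensor([2.0, 1.0, 0.1])

# Softmax turns the scores into probabilities that sum to 1
probs = F.softmax(scores, dim=0)
print(probs)        # tensor([0.6590, 0.2424, 0.0986])
print(probs.sum())  # tensor(1.)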
Softmax is combined with Cross-Entropy-Loss to calculate the loss of a model.
Unfortunately, because this combination is so common, it is often abbreviated: some use the term Softmax-Loss, whereas PyTorch simply calls it Cross-Entropy-Loss.
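
This means nn.CrossEntropyLoss expects the raw, unnormalized logits as input and applies the (log-)softmax internally. A minimal sketch of the equivalence, again with made-up values:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1]])  # raw model output, no softmax applied
target = torch.tensor([0])                # index of the true class

# CrossEntropyLoss takes the raw logits directly...
ce = nn.CrossEntropyLoss()(logits, target)

# ...and is equivalent to log-softmax followed by negative log-likelihood
nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(ce.item(), nll.item())  # both ~0.4170

A practical consequence: if you put an explicit softmax layer before nn.CrossEntropyLoss, the softmax is effectively applied twice and the computed loss is wrong.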