Pytorch LSTM: Target Dimension in Calculating Cross Entropy Loss

Submitted: 2019-12-23 13:01:12

Question


I've been trying to get an LSTM (LSTM followed by a linear layer in a custom model), working in Pytorch, but was getting the following error when calculating the loss:

Assertion `cur_target >= 0 && cur_target < n_classes' failed.

I defined the loss function with:

criterion = nn.CrossEntropyLoss()

and then called with

loss += criterion(output, target)

My target had dimensions [sequence_length, number_of_classes], and my output had dimensions [sequence_length, 1, number_of_classes].

The examples I was following seemed to be doing the same thing, but the PyTorch documentation for cross-entropy loss describes something different.

The docs say the target should have dimension (N), where each value satisfies 0 ≤ targets[i] ≤ C−1 and C is the number of classes. I changed the target to that form, but now I'm getting a different error (my sequence length is 75, and there are 55 classes):

Expected target size (75, 55), got torch.Size([75])

I've tried looking at solutions for both errors, but still can't get this working properly. I'm confused about the proper dimensions of the target, as well as the actual meaning behind the first error (different searches gave very different explanations of the error, and none of the fixes worked).

Thanks


Answer 1:


You can use squeeze() on your output tensor; it returns a tensor with all dimensions of size 1 removed.

This short code uses the shapes you mentioned in your question:

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# creates a random tensor with your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates a tensor with random class-index targets
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()
loss = criterion(output, target)
print(loss)

Results in the error you described:

ValueError: Expected target size (75, 55), got torch.Size([75])

So using squeeze() on your output tensor solves the problem by bringing it to the correct shape.

Example with corrected shape:

import torch
import torch.nn as nn

sequence_length   = 75
number_of_classes = 55
# creates a random tensor with your output shape
output = torch.rand(sequence_length, 1, number_of_classes)
# creates a tensor with random class-index targets
target = torch.randint(number_of_classes, (sequence_length,)).long()

# define loss function and calculate loss
criterion = nn.CrossEntropyLoss()

# apply squeeze() on the output tensor to change its shape from [75, 1, 55] to [75, 55]
loss = criterion(output.squeeze(), target)
print(loss)

Output:

tensor(4.0442)

Using squeeze() changes your tensor shape from [75, 1, 55] to [75, 55], so that the output and target shapes match!

You can also use other methods to reshape your tensor; what matters is that it has shape [sequence_length, number_of_classes] instead of [sequence_length, 1, number_of_classes].
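For example, squeeze(1) or reshape get you to the same shape. A minimal sketch (assuming the shapes from the question): squeeze(1) is slightly safer than a plain squeeze(), because it removes only the known singleton dimension and would not also drop the sequence dimension if the sequence length happened to be 1.

```python
import torch

# random tensor standing in for the model output [sequence_length, 1, number_of_classes]
output = torch.rand(75, 1, 55)

# squeeze(1) removes only dimension 1
a = output.squeeze(1)
# reshape states the desired shape explicitly
b = output.reshape(75, 55)

print(a.shape, b.shape)  # both torch.Size([75, 55])
```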

Your targets should be a LongTensor (i.e. a tensor of type torch.long) containing the class indices, with shape [sequence_length].
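If your targets start out one-hot encoded with shape [sequence_length, number_of_classes], as described in the question, you can convert them to class indices with argmax over the class dimension. A sketch, assuming the targets are exact one-hot vectors:

```python
import torch

# hypothetical one-hot targets of shape [sequence_length, number_of_classes]
one_hot = torch.zeros(75, 55)
one_hot[torch.arange(75), torch.randint(55, (75,))] = 1.0

# argmax over the class dimension yields class indices of shape [75]
target = one_hot.argmax(dim=1)
print(target.shape)  # torch.Size([75])
```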

Edit:
Shapes from above example when passing to cross-entropy function:

Outputs: torch.Size([75, 55])
Targets: torch.Size([75])
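A quick way to sanity-check these shapes before calling the loss function is a few asserts; this is just a sketch using the shapes from above:

```python
import torch

output = torch.rand(75, 1, 55).squeeze(1)     # model output, brought to [N, C]
target = torch.randint(55, (75,)).long()      # class indices, shape [N]

# CrossEntropyLoss expects output [N, C] and target [N] of class indices
assert output.shape == (75, 55)
assert target.shape == (75,)
assert target.dtype == torch.long
assert int(target.min()) >= 0 and int(target.max()) < output.shape[1]
```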


Here is a more general example of what outputs and targets should look like for cross-entropy loss. Assume five target classes; the three examples below cover sequences of length 1, 2, and 3:

import torch
import torch.nn as nn

# init CE loss function
criterion = nn.CrossEntropyLoss()

# sequence of length 1
output = torch.rand(1, 5)
# the first class (index 0) is our target
target = torch.LongTensor([0])
loss = criterion(output, target)
print('Sequence of length 1:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 2
output = torch.rand(2, 5)
# targets: first class (index 0) for the first element, second class (index 1) for the second
target = torch.LongTensor([0, 1])
loss = criterion(output, target)
print('\nSequence of length 2:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

# sequence of length 3
output = torch.rand(3, 5)
# targets: first class, second class, and second class again for the last element
target = torch.LongTensor([0, 1, 1])
loss = criterion(output, target)
print('\nSequence of length 3:')
print('Output:', output, 'shape:', output.shape)
print('Target:', target, 'shape:', target.shape)
print('Loss:', loss)

Output:

Sequence of length 1:
Output: tensor([[ 0.1956,  0.0395,  0.6564,  0.4000,  0.2875]]) shape: torch.Size([1, 5])
Target: tensor([ 0]) shape: torch.Size([1])
Loss: tensor(1.7516)

Sequence of length 2:
Output: tensor([[ 0.9905,  0.2267,  0.7583,  0.4865,  0.3220],
        [ 0.8073,  0.1803,  0.5290,  0.3179,  0.2746]]) shape: torch.Size([2, 5])
Target: tensor([ 0,  1]) shape: torch.Size([2])
Loss: tensor(1.5469)

Sequence of length 3:
Output: tensor([[ 0.8497,  0.2728,  0.3329,  0.2278,  0.1459],
        [ 0.4899,  0.2487,  0.4730,  0.9970,  0.1350],
        [ 0.0869,  0.9306,  0.1526,  0.2206,  0.6328]]) shape: torch.Size([3, 5])
Target: tensor([ 0,  1,  1]) shape: torch.Size([3])
Loss: tensor(1.3918)

I hope this helps!



Source: https://stackoverflow.com/questions/53455780/pytorch-lstm-target-dimension-in-calculating-cross-entropy-loss
