I'm having trouble understanding the documentation for PyTorch's LSTM module (and also RNN and GRU, which are similar). Regarding the outputs, it says:
I just verified some of this in code, and it's indeed correct that for a single-layer LSTM, h_n is the same as the last time step of output. (This does not hold for LSTMs with more than one layer, as @nnnmmm explained above.)
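To make the multi-layer caveat concrete, here is a small sketch (shapes assume the default sequence-first layout and batch size 1; the sizes are arbitrary): output holds the top layer's hidden state at every time step, while h_n holds the final hidden state of every layer, so only the last layer of h_n matches output[-1].

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=1, hidden_size=50, num_layers=2)
x = torch.rand(50, 1, 1)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

# output: top layer's hidden state for all 50 time steps
print(output.shape)  # torch.Size([50, 1, 50])
# h_n: final hidden state of each of the 2 layers
print(h_n.shape)     # torch.Size([2, 1, 50])

# Only the last layer of h_n equals the last time step of output.
print(torch.allclose(output[-1], h_n[-1]))  # True
print(torch.allclose(output[-1], h_n[0]))   # False
```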
So, basically, the output we get after applying the LSTM is not the o_t defined in the documentation; it is h_t. (In the documentation's notation, h_t = o_t * tanh(c_t), where o_t is only the output gate activation.)
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.LSTM(input_size=1, hidden_size=50, num_layers=1)
x = torch.rand(50, 1, 1)  # (seq_len, batch, input_size)
output, (hn, cn) = model(x)
One can now check that output[-1] and hn have the same value:
tensor([[ 0.1140, -0.0600, -0.0540, 0.1492, -0.0339, -0.0150, -0.0486, 0.0188,
0.0504, 0.0595, -0.0176, -0.0035, 0.0384, -0.0274, 0.1076, 0.0843,
-0.0443, 0.0218, -0.0093, 0.0002, 0.1335, 0.0926, 0.0101, -0.1300,
-0.1141, 0.0072, -0.0142, 0.0018, 0.0071, 0.0247, 0.0262, 0.0109,
0.0374, 0.0366, 0.0017, 0.0466, 0.0063, 0.0295, 0.0536, 0.0339,
0.0528, -0.0305, 0.0243, -0.0324, 0.0045, -0.1108, -0.0041, -0.1043,
-0.0141, -0.1222]], grad_fn=<SelectBackward0>)
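Rather than eyeballing the printed tensors, the equality can be checked programmatically. This repeats the same single-layer setup as above and compares the tensors with torch.allclose (hn has shape (num_layers, batch, hidden_size), so hn[0] is the comparable slice):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.LSTM(input_size=1, hidden_size=50, num_layers=1)
x = torch.rand(50, 1, 1)  # (seq_len, batch, input_size)
output, (hn, cn) = model(x)

# With num_layers=1, the last time step of output is exactly h_n.
print(torch.allclose(output[-1], hn[0]))  # True
```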