Why do we "pack" the sequences in PyTorch?
Question: I was trying to replicate "How to use packing for variable-length sequence inputs for rnn", but I guess I first need to understand why we need to "pack" the sequences. I understand why we "pad" them, but why is "packing" (through pack_padded_sequence) necessary? Any high-level explanation would be appreciated!

Answer 1: I have stumbled upon this problem too, and below is what I figured out. When training an RNN (LSTM, GRU, or vanilla RNN), it is difficult to batch variable-length sequences.
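As a concrete illustration (not part of the original answer), here is a minimal sketch of the pad-then-pack workflow the question refers to. The sequence lengths, tensor sizes, and LSTM dimensions are made up for the example; the calls are the standard torch.nn.utils.rnn utilities.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

# Three variable-length sequences with feature dimension 4 (e.g. word embeddings),
# already sorted by length in descending order.
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([len(s) for s in seqs])

# Pad to a common length so they fit in one batch tensor: (batch, max_len, 4)
padded = pad_sequence(seqs, batch_first=True)

# Pack the padded batch; the lengths tell the RNN where each sequence really ends,
# so the padded time steps are skipped rather than processed.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=True)

lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)

# Unpack back to a padded tensor if per-time-step outputs are needed.
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)   # torch.Size([3, 5, 8])
print(h_n.shape)   # torch.Size([1, 3, 8]); hidden state at each sequence's last *real* step
```

Without packing, h_n would instead reflect the hidden state after the trailing pad steps, which is one of the main motivations for packing discussed in the answer.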