I'm working with PyTorch's nn.TransformerEncoder module. My input samples have the usual shape (batch-size, seq-len, emb-dim). All samples