OpenAI's REINFORCE and actor-critic example for reinforcement learning has the following code:
REINFORCE:
policy_loss = torch.cat(policy_loss).sum()
torch.stack: concatenates a sequence of tensors along a new dimension.
torch.cat: concatenates the given sequence of tensors along an existing dimension.
So if A and B are of shape (3, 4), torch.cat([A, B], dim=0) will be of shape (6, 4) and torch.stack([A, B], dim=0) will be of shape (2, 3, 4).
import torch

A = torch.randn(3, 4)
B = torch.randn(3, 4)
torch.cat([A, B], dim=0).shape    # torch.Size([6, 4])  -- same dims, longer along dim 0
torch.stack([A, B], dim=0).shape  # torch.Size([2, 3, 4])  -- a new leading dimension
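
Tying this back to the question: in the REINFORCE snippet above, each entry appended to policy_loss appears to be a 1-element tensor, so cat and stack produce different shapes but the same sum. A minimal sketch, assuming hypothetical 1-element per-step losses (the values here are made up for illustration):

import torch

# stand-in for the per-step losses collected in the example,
# assuming each one is a 1-element tensor of shape (1,)
policy_loss = [torch.tensor([0.5]), torch.tensor([1.5]), torch.tensor([2.0])]

torch.cat(policy_loss).shape    # torch.Size([3])
torch.stack(policy_loss).shape  # torch.Size([3, 1])

torch.cat(policy_loss).sum()    # tensor(4.)
torch.stack(policy_loss).sum()  # tensor(4.)  -- same scalar either way

So for this particular use case the two calls are interchangeable once you reduce with .sum(); the difference only matters when you need to keep the resulting shape.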