I created a VAE architecture to encode dance frames into latent representations.
Then I planned to use an LSTM to take a sequence of those latent vectors to predict th
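
For reference, here is a minimal sketch of how the encoded frames could be arranged into training examples for a sequence model. It assumes the goal is next-step prediction (each run of consecutive latent vectors is paired with the latent that follows it), which the text above does not confirm. The helper name `make_windows`, the window length, and the toy latent values are all hypothetical; a real pipeline would use the VAE encoder's actual output.

```python
from typing import List, Sequence, Tuple

Latent = Sequence[float]


def make_windows(
    latents: Sequence[Latent], window: int
) -> List[Tuple[List[Latent], Latent]]:
    """Pair each run of `window` consecutive latents with the latent after it.

    Assumption: the sequence model is trained to predict the next latent vector.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    # Stop early enough that every window still has a following latent as target.
    for start in range(len(latents) - window):
        inputs = list(latents[start:start + window])
        target = latents[start + window]
        pairs.append((inputs, target))
    return pairs


# Toy stand-in for VAE encoder output: 5 frames, latent size 2 (made-up values).
latents = [(0.0, 0.1), (0.2, 0.3), (0.4, 0.5), (0.6, 0.7), (0.8, 0.9)]
pairs = make_windows(latents, window=3)
```

Each element of `pairs` is an `(inputs, target)` tuple. The inputs would be stacked into a `(window, latent_dim)` array before being fed to the LSTM, and the target would be compared against the LSTM's output in latent space.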