I'm trying to implement a program that compares LSTM's performance vs GRU's performance for word prediction. I am using the same parameters for both of them; however, whil