A few days ago I finished writing a word prediction program that tests both LSTM and GRU models on a given dataset. It tests four models: two LSTM models and two GRU models. I wrote