While browsing through some past projects, I was perplexed by a phenomenon: the first few epochs of my training runs always have the same validation accuracy (sometimes the same