My understanding is that it is good practice to shuffle training samples for each epoch so that each mini-batch contains a nice random sample of the entire dataset. If I convert my dataset to TFRecord files, do I still need to shuffle the samples before writing them?
Actually, you don't have to worry about shuffling before saving to TFRecords anymore. That's because the (currently) recommended way to read TFRecords is tf.data.TFRecordDataset, which provides a .shuffle() method that reshuffles the data each epoch.
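For example, here is a minimal sketch of such an input pipeline. The file names and feature spec are just placeholders; adjust them to your own data:

```python
import tensorflow as tf

# Placeholder file names -- point these at your own TFRecord files.
filenames = ["train-00.tfrecord", "train-01.tfrecord"]

def parse_example(serialized):
    # Assumed feature layout: a raw image string and an integer label.
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    return tf.io.parse_single_example(serialized, features)

dataset = (
    tf.data.TFRecordDataset(filenames)
    .shuffle(buffer_size=10_000)  # reshuffled on every epoch by default
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)
)
```

One thing to keep in mind: .shuffle() only mixes records within its buffer, so pick a buffer_size large enough relative to your dataset (or the degree of ordering in your files) to get the randomness you want.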