In most of the models, there is a steps parameter indicating the number of steps to run over the data. Yet in most practical usage, I also see training being run for some number of epochs. What is the difference between steps and epochs?
A training step is one gradient update. In one step, batch_size examples are processed.
An epoch consists of one full pass through the training data, which usually takes many steps. For example, if you have 2,000 images and use a batch size of 10, an epoch consists of 2,000 images / (10 images/step) = 200 steps.
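The arithmetic above can be sketched as a small helper (the numbers are the hypothetical ones from the example, not from any real dataset):

```python
def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of gradient updates in one full pass over the data.

    Assumes num_examples is divisible by batch_size; frameworks
    typically round up and run a smaller final batch otherwise.
    """
    return num_examples // batch_size

# 2,000 images, batch size 10 -> 200 steps per epoch
steps = steps_per_epoch(2000, 10)
print(steps)            # 200

# Training for 5 epochs would then take 5 * 200 = 1,000 steps total.
print(5 * steps)        # 1000
```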
If you choose your training images randomly (and independently) in each step, you normally do not call it an epoch. [This is where my answer differs from the previous one. Also see my comment.]