TensorFlow: nr. of epochs vs. nr. of training steps

Question

I have recently experimented with Google's seq2seq to set up a small NMT system. I managed to get everything working, but I am still wondering about the exact difference between the number of epochs and the number of training steps of a model.

If I am not mistaken, one epoch consists of multiple training steps and is complete once your whole training set has been processed once. I do not understand, however, the difference between the two when I look at the documentation in Google's own tutorial on NMT. Note the last line of the following snippet.

export DATA_PATH=

export VOCAB_SOURCE=${DATA_PATH}/vocab.bpe.32000
export VOCAB_TARGET=${DATA_PATH}/vocab.bpe.32000
export TRAIN_SOURCES=${DATA_PATH}/train.tok.clean.bpe.32000.en
export TRAIN_TARGETS=${DATA_PATH}/train.tok.clean.bpe.32000.de
export DEV_SOURCES=${DATA_PATH}/newstest2013.tok.bpe.32000.en
export DEV_TARGETS=${DATA_PATH}/newstest2013.tok.bpe.32000.de

export DEV_TARGETS_REF=${DATA_PATH}/newstest2013.tok.de
export TRAIN_STEPS=1000000

It seems to me that you can only define the number of training steps, not the number of epochs, of your model. Is it possible that there is an overlap in terminology and that it is therefore not necessary to define a number of epochs?


Answer 1:

An epoch consists of going through all of your training samples once, while one step (iteration) refers to training on a single minibatch. So if you have 1,000,000 training samples and use a batch size of 100, one epoch is equivalent to 10,000 steps, with 100 samples per step.
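
To make the arithmetic concrete, here is a minimal sketch in Python. The sample count of 1,000,000 comes from the example above and the TRAIN_STEPS value from the config snippet in the question, so the resulting epoch count is purely illustrative.

# Relationship between epochs, steps, and batch size
# (numbers are the illustrative ones from the answer and the config above).
num_samples = 1000000        # assumed size of the training set
batch_size = 100

steps_per_epoch = num_samples // batch_size   # 10,000 steps per epoch
train_steps = 1000000                         # TRAIN_STEPS from the config
epochs = train_steps / steps_per_epoch        # -> 100 passes over the data

print(steps_per_epoch, epochs)                # 10000 100.0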

A high-level neural network framework may let you set either the number of epochs or the total number of training steps. But you can't set both independently, since, given the dataset size and the batch size, one directly determines the other.
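
For illustration, here is a small runnable sketch of the epoch-based convention in tf.keras (the model and data are toy placeholders, not part of the original question); an Estimator-style trainer such as the seq2seq tool above instead takes a total step count like TRAIN_STEPS.

import numpy as np
import tensorflow as tf

# Toy data standing in for a real training set (hypothetical shapes).
train_x = np.random.rand(1000, 8).astype("float32")
train_y = np.random.rand(1000, 1).astype("float32")

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

# Keras convention: you specify epochs; with 1,000 samples and batch_size=100,
# each epoch runs 10 steps, so this call performs 50 steps in total.
model.fit(train_x, train_y, batch_size=100, epochs=5)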



Source: https://stackoverflow.com/questions/43319709/tensorflow-nr-of-epochs-vs-nr-of-training-steps
