Question
When I don't use queues, I like to tally the loss, accuracy, PPV, etc. during an epoch of training and submit that tf.summary at the end of every epoch.
I'm not sure how to replicate this behavior with queues. Is there a signal I can listen to for when an epoch is complete?
(TensorFlow version 0.9)
A typical setup goes as follows:
queue = tf.train.string_input_producer(filenames, num_epochs=7)
# ...build the rest of the graph...

# training
try:
    while not coord.should_stop():
        sess.run(train_op)
except tf.errors.OutOfRangeError:
    # the files have been read num_epochs times
    # do some stuff... maybe summaries
    coord.request_stop()
finally:
    coord.join(threads)
So, clearly, I could just set num_epochs=1 and create the summaries in the except block, but that would mean re-running my entire program once per epoch, which hardly seems efficient.
Answer 1:
EDIT: Changed to account for edits to the question.
An epoch is not a built-in concept known to TensorFlow. You have to keep track of epochs yourself in your training loop and run the summary ops at the end of each epoch. Pseudocode like the following should work:
num_mini_batches_in_epoch = ...  # something like examples_in_file / mini_batch_size

try:
    while not coord.should_stop():
        for i in range(num_mini_batches_in_epoch):
            sess.run(train_op)
        # one full epoch is done: run the summary ops
        sess.run([loss_summary, accuracy_summary])
except tf.errors.OutOfRangeError:
    # the files have been read num_epochs times
    # do some stuff... maybe final summaries
    coord.request_stop()
finally:
    coord.join(threads)
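To make the tallying part of the pattern concrete, here is a framework-agnostic sketch of what "tally during the epoch, summarize at the epoch boundary" looks like. The function name and inputs are hypothetical stand-ins: each inner list plays the role of the per-batch loss values you would get back from `sess.run`, and the returned averages are what you would feed into a summary op at the end of each epoch.

```python
def run_epochs(batch_losses_per_epoch):
    """batch_losses_per_epoch: a list of epochs, each a list of per-batch losses.

    Returns one averaged loss per epoch (the value you would hand to a
    per-epoch summary op).
    """
    epoch_summaries = []
    for epoch_batches in batch_losses_per_epoch:
        total, count = 0.0, 0
        for loss in epoch_batches:   # inner loop = mini-batches in one epoch
            total += loss            # tally during the epoch
            count += 1
        epoch_summaries.append(total / count)  # summarize at the epoch boundary
    return epoch_summaries
```

The same shape carries over to accuracy, PPV, or any other metric: keep running totals in plain Python variables across the inner loop, and reset them after emitting the summary.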
Source: https://stackoverflow.com/questions/38232417/tensorflow-fully-connected-control-flow-per-n-epoch-summary