Getting good mixing with many input datafiles in tensorflow

☆樱花仙子☆ submitted on 2019-12-01 06:23:32

Yes - what you want is to use a combination of two things.

First, randomly shuffle the order in which your data files are read: put the list of filenames into a tf.train.string_input_producer with shuffle=True, and feed the resulting filename queue into whatever input method you use (if you can put your examples into tf.Example proto format, they are easy to parse with parse_example). To be very clear, the string_input_producer holds only the filenames; the file contents are then read by another method such as read_file.
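The per-epoch filename shuffling described above can be pictured with a small pure-Python sketch. This is only an illustration of the behaviour, not TensorFlow code: filename_epochs and its parameters are hypothetical names, and the real op feeds a queue rather than a generator.

```python
import random

def filename_epochs(filenames, num_epochs, shuffle=True, seed=None):
    """Yield every filename once per epoch, reshuffling the order each epoch."""
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(filenames)
        if shuffle:
            rng.shuffle(order)  # new random order for this epoch
        for name in order:
            yield name

files = ["file1", "file2", "file3", "file4"]
stream = list(filename_epochs(files, num_epochs=3, seed=0))

# Split the stream back into epochs to show that each one is a full pass.
epochs = [stream[i:i + len(files)] for i in range(0, len(stream), len(files))]
for epoch in epochs:
    print(epoch)
```

Every file is still visited exactly once per epoch; only the order changes. That is why this step alone mixes at file granularity and not at example granularity.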

Second, you need to mix at a finer granularity. You can accomplish this by feeding the input examples into a tf.train.shuffle_batch node with a large capacity and a large value of min_after_dequeue. One particularly nice way is to use a shuffle_batch_join that receives input from multiple files, so that you get a lot of mixing. Set the capacity large enough to mix well without exhausting your RAM. Tens of thousands of examples usually work pretty well.
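To see why the buffer size matters, here is a minimal pure-Python simulation of a shuffle buffer. It is a sketch under stated assumptions, not TensorFlow's implementation: shuffle_stream is a hypothetical helper, and it models a consumer fast enough that the buffer never grows beyond min_after_dequeue + 1 elements (in a real queue the buffer can fill up to capacity).

```python
import random

def shuffle_stream(examples, min_after_dequeue, seed=None):
    """Emit examples in randomized order, keeping min_after_dequeue buffered."""
    rng = random.Random(seed)
    buffer = []
    for example in examples:
        buffer.append(example)
        # Only dequeue while enough elements remain behind for mixing.
        if len(buffer) > min_after_dequeue:
            idx = rng.randrange(len(buffer))
            # Swap the chosen element to the end so pop() is O(1).
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer

# Integers stand in for examples; their value is their arrival position.
for m in (0, 4, 19):
    print(m, list(shuffle_stream(range(20), m, seed=0)))
```

In this model an example can be emitted at most min_after_dequeue positions earlier than it arrived, so examples from a file that is read late cannot show up in early batches. This is the reason to combine a large buffer with shuffled or interleaved file input.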

Keep in mind that the batch functions add a QueueRunner to the QUEUE_RUNNERS collection, so you need to call tf.train.start_queue_runners() before evaluating a batch; otherwise the queues are never filled and the session blocks.
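The role of the queue runners can be illustrated with a standard-library analogy. This sketch uses Python's queue and threading modules, not TensorFlow: start_runner is a hypothetical stand-in for starting a queue runner, and the stop event plays roughly the role a coordinator would.

```python
import queue
import threading

def start_runner(q, source, stop_event):
    """Start a daemon thread that enqueues items from source until stopped."""
    def run():
        for item in source:
            # Retry the put so the thread can notice a stop request
            # even while the queue is full.
            while not stop_event.is_set():
                try:
                    q.put(item, timeout=0.1)
                    break
                except queue.Full:
                    continue
            if stop_event.is_set():
                return
    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread

q = queue.Queue(maxsize=4)

# No runner started yet: nothing fills the queue, so a dequeue cannot succeed.
try:
    q.get(timeout=0.2)
    got_item_before_start = True
except queue.Empty:
    got_item_before_start = False

stop = threading.Event()
worker = start_runner(q, range(10), stop)
batch = [q.get(timeout=5) for _ in range(3)]

# Ask the producer to stop and wait for it to exit.
stop.set()
worker.join(timeout=5)
print(got_item_before_start, batch)
```

The first dequeue attempt times out because no producer thread exists yet, which mirrors the hang you see when the queue runners were never started.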

In your case it is not a problem to do some preprocessing and create one file out of all the files you have. For this type of game, where the history is not important and the position determines everything, your dataset can consist simply of position -> next_move pairs.
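A minimal sketch of that preprocessing step, assuming each input file stores one tab-separated position/next_move record per line and that all records fit in memory. The file names, the record format, and the merge_and_shuffle helper are hypothetical and chosen only for illustration.

```python
import os
import random
import tempfile

def merge_and_shuffle(input_paths, output_path, seed=None):
    """Read all records, shuffle them in memory, write them to one file."""
    records = []
    for path in input_paths:
        with open(path, encoding="utf-8") as f:
            records.extend(line.rstrip("\n") for line in f if line.strip())
    random.Random(seed).shuffle(records)
    with open(output_path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(record + "\n")
    return len(records)

with tempfile.TemporaryDirectory() as tmp:
    # Two made-up game files with position -> next_move records.
    games = {
        "game1.txt": ["posA\tmove1", "posB\tmove2"],
        "game2.txt": ["posC\tmove3", "posD\tmove4", "posE\tmove5"],
    }
    paths = []
    for name, lines in games.items():
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(lines) + "\n")
        paths.append(path)

    merged_path = os.path.join(tmp, "all_positions.txt")
    count = merge_and_shuffle(paths, merged_path, seed=0)
    with open(merged_path, encoding="utf-8") as f:
        merged = f.read().splitlines()

print(count, merged)
```

Because every record is independent of the game it came from, shuffling once offline gives well-mixed batches even with a plain sequential reader.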


For the more general case, TF provides everything you need for the shuffling you want. There are two types of shuffling, which serve different purposes and shuffle different things:

  • tf.train.string_input_producer, shuffle: Boolean. If true, the strings are randomly shuffled within each epoch. So if you have a few files ['file1', 'file2', ..., 'filen'], the filenames are fed in a random order that is reshuffled every epoch. If false, the files follow one another in list order.
  • tf.train.shuffle_batch: Creates batches by randomly shuffling tensors. It buffers incoming tensors in a queue and draws batch_size of them at random for each batch (you need to start the threads that fill the queue with tf.train.start_queue_runners).