I am pretty new to TensorFlow. I used to use Theano for deep learning development. I noticed one difference between the two: where input data can be stored.
If your data fits on the GPU, you can load it into a constant on the GPU from, e.g., a NumPy array:
with tf.device('/gpu:0'):
    tensorflow_dataset = tf.constant(numpy_dataset)
One way to extract minibatches would be to slice that array at each step using tf.slice, instead of feeding it:
batch = tf.slice(tensorflow_dataset, [index, 0], [batch_size, -1])
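Putting the two snippets together, a minimal end-to-end sketch might look like the following. It assumes TensorFlow 2.x with eager execution. The toy `numpy_dataset`, the `batch_size` of 4, the CPU fallback, and the `step` loop that computes `index` are illustrative assumptions, not part of the original snippets.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy dataset: 10 examples with 3 features each.
numpy_dataset = np.arange(30, dtype=np.float32).reshape(10, 3)
batch_size = 4  # assumed value, chosen only for illustration

# Assumption: fall back to the CPU when no GPU is visible, so the sketch still runs.
device = '/GPU:0' if tf.config.list_physical_devices('GPU') else '/CPU:0'
with tf.device(device):
    tensorflow_dataset = tf.constant(numpy_dataset)

# Only full batches are taken here; the 2 leftover rows are skipped.
num_batches = numpy_dataset.shape[0] // batch_size
batches = []
for step in range(num_batches):
    index = step * batch_size  # first row of this minibatch
    # A size of -1 means "all remaining elements" along that dimension.
    batch = tf.slice(tensorflow_dataset, [index, 0], [batch_size, -1])
    batches.append(batch)
```

Note that a constant is embedded in the graph itself, so this approach is probably best kept for datasets that comfortably fit in device memory.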
There are many possible variations around that theme, including using queues to prefetch the data to GPU dynamically.
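To illustrate the prefetching idea in general terms, here is a framework-agnostic sketch using only Python's standard `queue` and `threading` modules. It is not TensorFlow's queue API: the helper `prefetch_batches`, its `capacity` parameter, and the toy data are hypothetical, and in a real setup the producer would also be responsible for copying each batch to the GPU.

```python
import queue
import threading


def prefetch_batches(dataset, batch_size, capacity=2):
    """Yield minibatches that a background thread prepares ahead of time."""
    buffer = queue.Queue(maxsize=capacity)
    done = object()  # sentinel marking the end of the data

    def producer():
        for start in range(0, len(dataset), batch_size):
            # put() blocks while the queue is full, so at most `capacity`
            # batches are waiting in the queue at any time.
            buffer.put(dataset[start:start + batch_size])
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        batch = buffer.get()
        if batch is done:
            return
        yield batch


# Hypothetical toy data: ten examples, consumed in batches of four.
toy_dataset = list(range(10))
batches = list(prefetch_batches(toy_dataset, batch_size=4))
```

The bounded queue lets the producer stay a couple of batches ahead while the consumer is busy with the current one, which is the same overlap a GPU prefetch queue aims for.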