parallelising tf.data.Dataset.from_generator

孤独总比滥情好 2020-12-01 01:24

I have a non-trivial input pipeline that from_generator is perfect for...

dataset = tf.data.Dataset.from         


        
3 answers
  •  执念已碎
    2020-12-01 02:16

    Limiting the work done in the generator to a minimum and parallelizing the expensive processing using a map is sensible.

    Alternatively, you can "join" multiple generators using parallel_interleave as follows:

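    import tensorflow as tf  # TF 1.x only: tf.contrib was removed in TF 2.x

    N = 4  # number of generators you use (example value)

    def generator(n):
      # Body of the n-th generator; n arrives as a NumPy scalar via `args`.
      yield n  # placeholder: yield your real elements here

    def dataset(n):
      # n is a scalar tensor here, so pass it to the generator through `args`.
      return tf.data.Dataset.from_generator(
          generator, output_types=tf.int64, args=(n,))

    ds = tf.data.Dataset.range(N).apply(
        tf.contrib.data.parallel_interleave(dataset, cycle_length=N))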
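
tf.contrib was removed in TensorFlow 2.x. A rough equivalent there, assuming TensorFlow 2.4 or later, is Dataset.interleave with num_parallel_calls. The generator body and N = 3 below are placeholders chosen for illustration only.

```python
import tensorflow as tf

N = 3  # number of generators (hypothetical value)

def generator(n):
    # n arrives as a NumPy scalar because it is passed through `args`.
    for i in range(4):
        yield n * 10 + i  # placeholder elements

def dataset(n):
    # One from_generator dataset per index n.
    return tf.data.Dataset.from_generator(
        generator,
        args=(n,),
        output_signature=tf.TensorSpec(shape=(), dtype=tf.int64))

ds = tf.data.Dataset.range(N).interleave(
    dataset,
    cycle_length=N,                       # pull from all N generators at once
    num_parallel_calls=tf.data.AUTOTUNE,  # fetch from them in parallel
    deterministic=False)                  # allow out-of-order elements

# Element order is not guaranteed with deterministic=False, so sort to inspect.
values = sorted(int(v) for v in ds.as_numpy_iterator())
```

Setting deterministic=False plays the role of sloppy=True in the older API: it trades a fixed element order for throughput. The generators themselves still execute Python code, so CPU-bound Python work inside them may not scale with N.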
    
