tensorflow-datasets

How do you send arguments to a generator function using tf.data.Dataset.from_generator()?

最后都变了 - Submitted on 2021-02-18 10:24:05
Question: I would like to create a number of tf.data.Dataset objects using the from_generator() function, and I would like to pass an argument to the generator function (raw_data_gen). The idea is that the generator will yield different data depending on the argument it receives, so that raw_data_gen can provide either training, validation or test data. training_dataset = tf.data.Dataset.from_generator(raw_data_gen, (tf.float32, tf.uint8), ([None, 1], [None]), args=([1])) validation…
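
A minimal sketch of how args= can select the split, assuming raw_data_gen yields (features, labels) pairs and that 1/2/3 stand for training/validation/test (those codes and the generated shapes are illustrative, not from the question):

    import numpy as np
    import tensorflow as tf

    def raw_data_gen(split):
        # Values passed via args= arrive as numpy scalars inside the generator
        split = int(split)
        for _ in range(5):
            features = np.full((10, 1), float(split), dtype=np.float32)  # shape [None, 1]
            labels = np.full((10,), split, dtype=np.uint8)               # shape [None]
            yield features, labels

    training_dataset = tf.data.Dataset.from_generator(
        raw_data_gen, (tf.float32, tf.uint8), ([None, 1], [None]), args=(1,))
    validation_dataset = tf.data.Dataset.from_generator(
        raw_data_gen, (tf.float32, tf.uint8), ([None, 1], [None]), args=(2,))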

How to use Keras' predict_on_batch in tf.data.Dataset.map()?

只谈情不闲聊 - Submitted on 2021-02-17 04:55:54
Question: I would like to find a way to use Keras' predict_on_batch inside tf.data.Dataset.map() in TF 2.0. Let's say I have a numpy dataset: n_data = 10**5; my_data = np.random.random((n_data,10,1)); my_targets = np.random.randint(0,2,(n_data,1)); data = ({'x_input': my_data}, {'target': my_targets}), and a tf.keras model: x_input = Input((None,1), name='x_input'); RNN = SimpleRNN(100, name='RNN')(x_input); dense = Dense(1, name='target')(RNN); my_model = Model(inputs=[x_input], outputs=[dense]) my…
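
One workaround sketch, assuming the goal is simply to attach the model's predictions to each batch: predict_on_batch expects numpy arrays, so inside map() the built model can instead be called directly on the batched tensors (this is an assumption about the intent, not the poster's final solution):

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Input, SimpleRNN
    from tensorflow.keras.models import Model

    n_data = 10**4
    my_data = np.random.random((n_data, 10, 1)).astype(np.float32)
    my_targets = np.random.randint(0, 2, (n_data, 1))
    data = ({'x_input': my_data}, {'target': my_targets})

    x_input = Input((None, 1), name='x_input')
    rnn = SimpleRNN(100, name='RNN')(x_input)
    dense = Dense(1, name='target')(rnn)
    my_model = Model(inputs=[x_input], outputs=[dense])

    # Call the already-built model on tensors inside map() instead of predict_on_batch
    dataset = (tf.data.Dataset.from_tensor_slices(data)
               .batch(32)
               .map(lambda x, y: (my_model(x['x_input']), y['target'])))

    for preds, targets in dataset.take(1):
        print(preds.shape, targets.shape)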

Is there a queue-like dataset?

谁说我不能喝 - Submitted on 2021-02-11 17:09:02
问题 It seems that tf.data.Dataset provides a more flexible and more sophisticated alternative to TF queues (subclasses of QueueBase ). (E.g. a TF queue cannot really be reopened after it was closed, see here, here.) (There also seems to be some downsides with Dataset , like that it runs (mostly) on CPU.) I liked the FIFOQueue . Is there some equivalent Dataset ? More specifically, I have one (or multiple) background thread which would get data from somewhere (might not be TF related), and this
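
A rough sketch of one way to approximate a FIFOQueue with tf.data (names, shapes and the sentinel convention are illustrative): let the background thread push items into a Python queue.Queue and drain it with Dataset.from_generator:

    import queue
    import threading
    import numpy as np
    import tensorflow as tf

    data_queue = queue.Queue(maxsize=100)

    def producer():
        # Stand-in for the background thread that gets data from somewhere
        for i in range(1000):
            data_queue.put(np.full((4,), i, dtype=np.float32))
        data_queue.put(None)  # sentinel marking the end of the stream

    threading.Thread(target=producer, daemon=True).start()

    def drain_queue():
        while True:
            item = data_queue.get()
            if item is None:
                return
            yield item

    dataset = tf.data.Dataset.from_generator(
        drain_queue, output_types=tf.float32, output_shapes=(4,))

    for batch in dataset.batch(8).take(2):
        print(batch.numpy())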

How to use tf.data.Dataset with kedro?

烈酒焚心 - Submitted on 2021-02-11 14:58:23
Question: I am using tf.data.Dataset to prepare a streaming dataset which is used to train a tf.keras model. With kedro, is there a way to create a node and return the created tf.data.Dataset so that it can be used in the next (training) node? MemoryDataset will probably not work because a tf.data.Dataset cannot be pickled (deepcopy isn't possible); see also this SO question. According to issue #91, the deep copy in MemoryDataset is done to prevent other nodes from modifying the data. Can someone please elaborate a…
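
One workaround that is often suggested (a sketch under the assumption that the installed kedro version exposes MemoryDataSet with a copy_mode argument; the dataset names below are illustrative): register the intermediate output with copy_mode="assign" so kedro hands the tf.data.Dataset to the next node by reference instead of deep-copying it:

    import tensorflow as tf
    from kedro.io import DataCatalog, MemoryDataSet
    from kedro.pipeline import Pipeline, node
    from kedro.runner import SequentialRunner

    def make_dataset():
        return tf.data.Dataset.range(10).batch(2)

    def train_model(dataset):
        for batch in dataset:
            pass  # training step would go here
        return "trained"

    catalog = DataCatalog({
        # "assign" skips both deepcopy and copy, handing over the object itself
        "streaming_data": MemoryDataSet(copy_mode="assign"),
    })

    pipeline = Pipeline([
        node(make_dataset, inputs=None, outputs="streaming_data"),
        node(train_model, inputs="streaming_data", outputs="result"),
    ])

    SequentialRunner().run(pipeline, catalog)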

“IndexError: list index out of range” in model.fit() method when using Dataset in Tensorflow Keras classifier

六月ゝ 毕业季﹏ - Submitted on 2021-02-11 12:36:33
Question: I'm new to TensorFlow and I'm trying to create a classifier using Keras. My training data is split into two files: one with the training examples (each example is a vector of 64 floats), and a second with the labels (each label is an int in the range 0..SIZE, where SIZE is 100, describing a class). Both files are quite large and I can't fit them into memory, so I've used tf.data.Dataset. I create two Datasets (one for the features and one for the labels) and then merge them using tf.data.Dataset.zip(). However…
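
Below is a small, self-contained sketch of the zip-then-fit pattern described above, using random in-memory stand-ins for the two files; note that the zipped dataset has to be batched before it is passed to model.fit():

    import numpy as np
    import tensorflow as tf

    SIZE = 100  # number of classes, as described in the question

    # Illustrative stand-ins for the two large files
    features = np.random.random((1000, 64)).astype(np.float32)
    labels = np.random.randint(0, SIZE, size=(1000,))

    feature_ds = tf.data.Dataset.from_tensor_slices(features)
    label_ds = tf.data.Dataset.from_tensor_slices(labels)

    # Pair each feature vector with its label, then shuffle and batch
    train_ds = tf.data.Dataset.zip((feature_ds, label_ds)).shuffle(1000).batch(32)

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu', input_shape=(64,)),
        tf.keras.layers.Dense(SIZE, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_ds, epochs=1)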

Tensorflow-IO Dataset input pipeline with very large HDF5 files

孤街醉人 - Submitted on 2021-02-10 12:19:31
Question: I have very big training files (30 GB). Since all of the data does not fit into my available RAM, I want to read the data in batches. I saw that the Tensorflow-io package implements a way to read HDF5 into TensorFlow via the function tfio.IODataset.from_hdf5(). Then, since tf.keras.Model.fit() takes a tf.data.Dataset as input containing both samples and targets, I need to zip my X and Y together and then use .batch and .prefetch to load only the necessary data into memory. For…
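
A sketch of that pipeline, assuming the HDF5 file is named train.h5 and stores the samples and targets under "/features" and "/targets" (both names are assumptions about the layout):

    import tensorflow as tf
    import tensorflow_io as tfio

    x_ds = tfio.IODataset.from_hdf5("train.h5", dataset="/features")
    y_ds = tfio.IODataset.from_hdf5("train.h5", dataset="/targets")

    # Zip samples and targets, then batch and prefetch so only the data
    # needed for the next step is held in memory
    train_ds = (tf.data.Dataset.zip((x_ds, y_ds))
                .batch(64)
                .prefetch(tf.data.experimental.AUTOTUNE))

    # model.fit(train_ds, epochs=...) can then consume the stream batch by batch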

Loading large data into TensorFlow 2.0 without loading it on the RAM

匆匆过客 - Submitted on 2021-02-08 08:30:31
Question: I have processed and saved a large dataset of video and audio files (about 8 to 9 GB of data). The data is saved as two numpy arrays, one for each modality, with shapes (number_of_examples, maximum_time_length, feature_length). I want to use this data to train my neural network for a classification task. I am using the TensorFlow 2.0 Beta version and running all of the code on Google Colab (after installing tf-2.0 beta). Each time I load the data into tf.data, the entire RAM of the…
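
One commonly used way to avoid loading everything at once (a sketch, not the poster's solution; the file names and the separate label array are assumptions): memory-map the saved .npy files so slices are read from disk on demand, and wrap them in a generator-backed tf.data.Dataset:

    import numpy as np
    import tensorflow as tf

    # mmap_mode="r" keeps the arrays on disk; indexing pulls in only the slice needed
    video = np.load("video_features.npy", mmap_mode="r")
    audio = np.load("audio_features.npy", mmap_mode="r")
    labels = np.load("labels.npy", mmap_mode="r")

    def sample_gen():
        for i in range(len(labels)):
            # Only the i-th example is copied into RAM here
            yield (video[i].astype(np.float32),
                   audio[i].astype(np.float32)), np.int64(labels[i])

    dataset = (tf.data.Dataset.from_generator(
                   sample_gen,
                   output_types=((tf.float32, tf.float32), tf.int64))
               .batch(16)
               .prefetch(tf.data.experimental.AUTOTUNE))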