TensorFlow-IO Dataset input pipeline with very large HDF5 files
Question: I have very large training files (30 GB). Since the data does not all fit in my available RAM, I want to read it in batches. I saw that the tensorflow-io package implements a way to read HDF5 into TensorFlow through the function tfio.IODataset.from_hdf5(). Then, since tf.keras.Model.fit() takes as input a tf.data.Dataset containing both samples and targets, I need to zip my X and Y together and then use .batch and .prefetch to load only the necessary data into memory. For