tensorflow-datasets

Tensorflow: tf.data.Dataset, Cannot batch tensors with different shapes in component 0

痞子三分冷 submitted on 2019-12-06 06:27:24
Question: I have the following error in my input pipeline:

    tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [2,48,48,3] and element 1 had shape [27,48,48,3].

with this code:

    dataset = tf.data.Dataset.from_generator(generator, (tf.float32, tf.int64, tf.int64, tf.float32, tf.int64, tf.float32))
    dataset = dataset.batch(max_buffer_size)

This is completely logical, as the batch method tries to create a (batch_size, ...
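One possible way around this error, assuming the leading dimension of each generated element is simply a variable number of samples (an assumption, since the excerpt is cut off): flatten the elements with unbatch() and then re-batch them to a fixed size. The generator below is a hypothetical stand-in with a single component that reproduces the two shapes from the error message, the batch size of 8 is arbitrary, and the sketch assumes TensorFlow 2.4 or later for output_signature.

```python
import numpy as np
import tensorflow as tf


def generator():
    # Hypothetical stand-in: each element holds a different number of
    # 48x48x3 samples, mirroring the shapes from the error message.
    for num_samples in (2, 27):
        yield np.zeros((num_samples, 48, 48, 3), dtype=np.float32)


dataset = tf.data.Dataset.from_generator(
    generator,
    output_signature=tf.TensorSpec(shape=(None, 48, 48, 3), dtype=tf.float32),
)

# unbatch() splits every element along its first axis into single samples,
# so batch() then only ever sees tensors of shape (48, 48, 3).
dataset = dataset.unbatch().batch(8)

batch_shapes = [batch.numpy().shape for batch in dataset]
print(batch_shapes)
```

If the elements must stay grouped as generated, padded_batch() is the other common option; which one fits depends on what the leading dimension means in the asker's data.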

Replacing Queue-based input pipelines with tf.data

旧时模样 submitted on 2019-12-06 05:26:12
I am reading Ganegedara's NLP with TensorFlow. The introduction to input pipelines has the following example:

    import tensorflow as tf
    import numpy as np
    import os

    # Defining the graph and session
    graph = tf.Graph() # Creates a graph
    session = tf.InteractiveSession(graph=graph) # Creates a session

    # The filename queue
    filenames = ['test%d.txt'%i for i in range(1,4)]
    filename_queue = tf.train.string_input_producer(filenames, capacity=3, shuffle=True,name='string_input_producer')

    # check if all files are there
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' ...

Output differences when changing order of batch(), shuffle() and repeat()

放肆的年华 submitted on 2019-12-06 04:41:53
Question: I have created a TensorFlow dataset, made it repeatable, shuffled it, divided it into batches, and constructed an iterator to get the next batch. But when I do this, elements are sometimes repeated (within and among batches), especially for small datasets. Why?

Answer 1: Contrary to what is stated in your own answer, no, shuffling and then repeating won't fix your problems. The key source of your problem is that you batch, then shuffle/repeat. That way, the items in your batches will always ...
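The effect of the ordering can be checked on a toy dataset. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution; the dataset of ten integers and the buffer sizes are illustrative choices, not taken from the question.

```python
import tensorflow as tf

data = tf.data.Dataset.range(10)

# Batch first, then shuffle: only whole batches are reordered,
# so the same elements always end up together.
batch_then_shuffle = data.batch(5).shuffle(buffer_size=2)
groups = sorted(batch.numpy().tolist() for batch in batch_then_shuffle)

# Shuffle first (buffer covering the whole dataset), then batch, then repeat:
# every epoch is a fresh permutation with no element repeated inside it.
shuffle_then_batch = data.shuffle(buffer_size=10).batch(5).repeat(2)
batches = [batch.numpy().tolist() for batch in shuffle_then_batch]
first_epoch = sorted(batches[0] + batches[1])
second_epoch = sorted(batches[2] + batches[3])

print(groups)
print(batches)
```

With shuffle() placed before batch() and repeat(), an element can only reappear once the next epoch starts. Placing repeat() before shuffle() would instead let elements from neighbouring epochs mix inside the shuffle buffer.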

TensorFlow: `tf.data.Dataset.from_generator()` does not work with strings on Python 3.x

谁说我不能喝 submitted on 2019-12-06 04:29:13
Question: I need to iterate through a large number of image files and feed the data to TensorFlow. I created a Dataset backed by a generator function that produces the file path names as strings, and then transform each string path to image data using map. But it failed, as generating string values won't work, as shown below. Is there a fix or workaround for this?

    2017-12-07 15:29:05.820708: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was ...
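For comparison, here is a minimal sketch of a string-yielding generator on a recent release, assuming TensorFlow 2.4 or later (for output_signature) with eager execution. It does not reproduce the asker's failing setup. The file names are hypothetical, and the map step only measures the path length as a stand-in for reading and decoding an image.

```python
import tensorflow as tf


def path_generator():
    # Hypothetical file names, used only for illustration.
    for name in ("img_0.png", "img_1.png", "img_2.png"):
        yield name


dataset = tf.data.Dataset.from_generator(
    path_generator,
    output_signature=tf.TensorSpec(shape=(), dtype=tf.string),
)

# Stand-in for the real per-file work (e.g. reading and decoding an image):
# here the map step only computes the length of each path string.
dataset = dataset.map(lambda path: (path, tf.strings.length(path)))

results = [(path.numpy(), int(length)) for path, length in dataset]
print(results)
```

Note that the string elements come back as Python bytes objects, so any Python-side code that consumes them has to decode them explicitly.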

How to use tf.data's initializable iterator and reinitializable iterator and feed data to estimator api?

别等时光非礼了梦想. submitted on 2019-12-06 02:53:41
Question: All the official Google tutorials use the one-shot iterator for every Estimator API implementation. I couldn't find any documentation on how to use tf.data's initializable iterator and reinitializable iterator instead of the one-shot iterator. Can someone kindly show me how to switch between train_data and test_data using tf.data's initializable iterator and reinitializable iterator? We need to run a session to use feed_dict and switch the dataset in the initializable iterator; it's a low-level ...

I get an error when importing tensorflow_datasets

泪湿孤枕 submitted on 2019-12-06 02:42:58
I want to use tensorflow_datasets in Jupyter (version 6.0.0) with Python 3. Doing so results in an error message, and I cannot fathom what the problem is. I made a new kernel for Python which should use tensorflow_datasets. The following steps were taken (in Anaconda, using my administrator option):

1. conda info --envs
2. conda create --name py3-TF2.0 python=3
3. conda activate py3-TF2.0
4. pip install matplotlib
5. pip install tensorflow==2.0.0-alpha0
6. pip install ipykernel
7. conda install nb_conda_kernels
8. pip install tensorflow-datasets

Upon closing I restarted my laptop.

Shuffling tfrecords files

强颜欢笑 submitted on 2019-12-06 01:05:34
I have 5 tfrecords files, one for each object. While training, I want to read data equally from all 5 tfrecords files, i.e. if my batch size is 50, I should get 10 samples from the first tfrecords file, 10 samples from the second tfrecords file, and so on. Currently, it just reads sequentially from all the three files, i.e. I get 50 samples from the same record. Is there a way to sample from different tfrecords files?

I advise you to read the tutorial by @mrry on tf.data. On slide 42 he explains how to use tf.data.Dataset.interleave() to read multiple tfrecord files at the same time. For instance if you ...
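The interleave() idea can be sketched as follows, assuming TensorFlow 2.x with eager execution. To keep the example self-contained, each file is replaced by a hypothetical in-memory stand-in whose records are just the index of the file they came from; with real files, the function passed to interleave() would build a tf.data.TFRecordDataset from each filename instead.

```python
import numpy as np
import tensorflow as tf

NUM_FILES = 5
SAMPLES_PER_FILE = 20  # hypothetical size of each file


def fake_file(file_index):
    # Stand-in for tf.data.TFRecordDataset(filename): every "record" is the
    # index of its source file, so the mix inside a batch is easy to check.
    return tf.data.Dataset.from_tensors(file_index).repeat(SAMPLES_PER_FILE)


dataset = tf.data.Dataset.range(NUM_FILES).interleave(
    fake_file,
    cycle_length=NUM_FILES,  # read from all five sources at once
    block_length=10,         # take 10 consecutive records from each source in turn
)
dataset = dataset.batch(50)

counts_per_batch = [
    np.bincount(batch.numpy(), minlength=NUM_FILES).tolist() for batch in dataset
]
print(counts_per_batch)
```

Setting block_length to batch_size / number_of_files is what yields an equal share per batch here. The records inside a batch still arrive grouped by source, so a shuffle() on the interleaved stream may be wanted before batching, at the cost of the exact 10-per-file split.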

Tensorflow - String processing in Dataset API

谁都会走 submitted on 2019-12-05 17:47:37
I have .txt files in a directory, with each line in the format <text>\t<label>. I am using the TextLineDataset API to consume these text records:

    filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
    dataset = tf.contrib.data.Dataset.from_tensor_slices(filenames)
    dataset = dataset.flat_map(
        lambda filename: (
            tf.contrib.data.TextLineDataset(filename)
            .map(_parse_data)))

    def _parse_data(line):
        line_split = tf.string_split([line], '\t')
        features = {"raw_text": tf.string(line_split.values[0].strip().lower()),
                    "label": tf.string_to_number(line_split.values[1], out_type=tf.int32)}
        parsed_features = tf.parse ...
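Python string methods such as strip() and lower() cannot be called on a tensor, so the per-line processing has to be written with TensorFlow string ops. A minimal sketch, assuming TensorFlow 2.x (where these ops live under tf.strings) rather than the tf.contrib API used in the question; the in-memory lines and the name parse_line are hypothetical stand-ins for the files and for _parse_data.

```python
import tensorflow as tf


def parse_line(line):
    # Split one "<text><TAB><label>" record into its two fields.
    parts = tf.strings.split(line, sep="\t")
    return {
        "raw_text": tf.strings.lower(tf.strings.strip(parts[0])),
        "label": tf.strings.to_number(parts[1], out_type=tf.int32),
    }


# Hypothetical in-memory lines standing in for tf.data.TextLineDataset(filenames).
lines = tf.constant(["  Hello World \t1", "Another LINE\t0"])
dataset = tf.data.Dataset.from_tensor_slices(lines).map(parse_line)

parsed = [(row["raw_text"].numpy(), int(row["label"])) for row in dataset]
print(parsed)
```

Because the function only uses TensorFlow ops, it runs inside the dataset's graph and needs no Python callback per element.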

Inference with a model trained with tf.Dataset

爱⌒轻易说出口 submitted on 2019-12-05 14:03:08
I have trained a model using the tf.data.Dataset API, so my training code looks something like this: with graph.as_default(): dataset = tf.data.TFRecordDataset(tfrecord_path) dataset = dataset.map(scale_features, num_parallel_calls=n_workers) dataset = dataset.shuffle(10000) dataset = dataset.padded_batch(batch_size, padded_shapes={...}) handle = tf.placeholder(tf.string, shape=[]) iterator = tf.data.Iterator.from_string_handle(handle, train_dataset.output_types, train_dataset.output_shapes) batch = iterator.get_next() ... # Model code ... iterator = dataset.make_initializable_iterator() with ...

How can I return the same batch twice from a tensorflow dataset iterator?

三世轮回 submitted on 2019-12-05 12:53:00
I am converting some legacy code to use the Dataset API. This code uses feed_dict to feed one batch to the train operation (actually three times) and then recalculates the losses for display using the same batch. So I need an iterator that returns the exact same batch two (or several) times. Unfortunately, I can't seem to find a way of doing it with TensorFlow datasets. Is it possible?

You can repeat individual elements of a Dataset using Dataset.flat_map(), Dataset.from_tensors() and Dataset.repeat() together. For example, to repeat elements twice:

    NUM_REPEATS = 2
    dataset = tf ...
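The combination named in the answer can be sketched like this, assuming TensorFlow 2.x with eager execution. This is not the answer's own (truncated) code; the toy dataset of six integers in batches of three is a hypothetical input chosen so the repetition is easy to see.

```python
import tensorflow as tf

NUM_REPEATS = 2

# Hypothetical batched dataset: six integers in batches of three.
dataset = tf.data.Dataset.range(6).batch(3)

# Wrap each batch in a one-element dataset and repeat it, so the iterator
# yields the identical batch NUM_REPEATS times before moving on.
dataset = dataset.flat_map(
    lambda batch: tf.data.Dataset.from_tensors(batch).repeat(NUM_REPEATS)
)

batches = [batch.numpy().tolist() for batch in dataset]
print(batches)
```

Applying flat_map() after batch() repeats whole batches, which matches the use case in the question; applying it before batch() would repeat individual elements instead.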