tensorflow-datasets

TensorFlow: “Cannot capture a stateful node by value” in tf.contrib.data API

这一生的挚爱 posted on 2019-11-28 03:40:20
Question: For transfer learning, one often uses a network as a feature extractor to create a dataset of features, on which another classifier is trained (e.g. an SVM). I want to implement this using the Dataset API (tf.contrib.data) and dataset.map(): # feature_extractor will create a CNN on top of the given tensor def features(feature_extractor, ...): dataset = inputs(...) # This creates a dataset of (image, label) pairs def map_example(image, label): features = feature_extractor(image, trainable=False
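
One way this pattern works without the stateful-capture error is to build the extractor once, outside the mapped function, so dataset.map() only captures existing variables instead of trying to create new ones. A minimal TF 2.x sketch; MobileNetV2 and the random input data are hypothetical stand-ins for the question's feature_extractor and inputs():

```python
import tensorflow as tf

# Build the extractor ONCE, outside the mapped function, so its variables
# already exist and are captured by reference rather than created inside
# dataset.map(). MobileNetV2 is a hypothetical stand-in here.
extractor = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg", weights=None)
extractor.trainable = False

def map_example(image, label):
    # Add a batch dimension for the CNN, then strip it again.
    feats = extractor(tf.expand_dims(image, 0), training=False)
    return tf.squeeze(feats, 0), label

images = tf.random.uniform([8, 224, 224, 3])  # placeholder data
labels = tf.zeros([8], dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).map(map_example)
```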

In TensorFlow's Dataset API, how do you map one element into multiple elements?

∥☆過路亽.° posted on 2019-11-28 02:59:17
Question: In the TensorFlow Dataset pipeline I'd like to define a custom map function which takes a single input element (data sample) and returns multiple elements (data samples). The code below is my attempt, along with the desired results. I could not follow the documentation on tf.data.Dataset().flat_map() well enough to understand whether it was applicable here. import tensorflow as tf input = [10, 20, 30] def my_map_func(i): return [[i, i+1, i+2]] # Fyi [[i], [i+1], [i+2]] throws an exception
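
flat_map is indeed the usual tool here: the mapped function returns a small Dataset per input element, and flat_map concatenates those back into one flat stream. A minimal sketch of the desired 1-to-3 expansion:

```python
import tensorflow as tf

# Each input element i becomes its own three-element dataset;
# flat_map then flattens these into a single stream of elements.
ds = tf.data.Dataset.from_tensor_slices([10, 20, 30])
ds = ds.flat_map(
    lambda i: tf.data.Dataset.from_tensor_slices(tf.stack([i, i + 1, i + 2])))

for x in ds:          # TF 2.x eager iteration
    print(x.numpy())  # 10 11 12 20 21 22 30 31 32
```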

How to input a list of lists with different sizes in tf.data.Dataset

可紊 posted on 2019-11-27 20:14:36
I have a long list of lists of integers (representing sentences, each of a different size) that I want to feed using the tf.data library. Each list (of the list of lists) has a different length, and I get an error, which I can reproduce here: t = [[4,2], [3,4,5]] dataset = tf.data.Dataset.from_tensor_slices(t) The error I get is: ValueError: Argument must be a dense tensor: [[4, 2], [3, 4, 5]] - got shape [2], but wanted [2, 2]. Is there a way to do this? EDIT 1: Just to be clear, I don't want to pad the input list of lists (it's a list of sentences containing over a million elements, with
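
One workaround that avoids padding the whole list up front is to feed the rows through a generator, so each element keeps its own length; any padding then happens per batch rather than globally. A sketch, with the ragged-tensor alternative noted for newer TF versions:

```python
import tensorflow as tf

t = [[4, 2], [3, 4, 5]]

# Variable-length rows pass through the generator one at a time, each
# keeping its own length; padding (if needed) is done per batch.
ds = tf.data.Dataset.from_generator(
    lambda: iter(t), output_types=tf.int32, output_shapes=[None])
ds = ds.padded_batch(2, padded_shapes=[None])

# Alternatively (TF >= 2.1), a ragged tensor avoids padding altogether:
# ds = tf.data.Dataset.from_tensor_slices(tf.ragged.constant(t))
```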

How does one move data to multiple GPU towers using TensorFlow's Dataset API

。_饼干妹妹 posted on 2019-11-27 17:28:20
We are running multi-GPU jobs on TensorFlow and evaluating a migration from the queue-based model (using the string_input_producer interface) to the new TensorFlow Dataset API. The latter appears to offer an easier way to switch between training and validation concurrently. The snippet of code below shows how we are doing this. train_dataset, train_iterator = get_dataset(train_files, batch_size, epochs) val_dataset, val_iterator = get_dataset(val_files, batch_size, epochs) is_validating = tf.placeholder(dtype=bool, shape=()) next_batch = tf.cond(is_validating, lambda: val_iterator.get_next(),
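
The fix generally suggested for this pattern is to call get_next() once per tower, so each GPU pulls its own batch from the shared iterators. A graph-mode (TF 1.x) sketch continuing the snippet above; num_gpus and build_tower are hypothetical, and the datasets are assumed to yield (image, label) pairs:

```python
# One get_next() per tower: every GPU receives a distinct batch from the
# same underlying iterator, following the question's tf.cond pattern.
tower_outputs = []
for gpu_id in range(num_gpus):
    with tf.device('/gpu:%d' % gpu_id):
        image_batch, label_batch = tf.cond(
            is_validating,
            lambda: val_iterator.get_next(),
            lambda: train_iterator.get_next())
        tower_outputs.append(build_tower(image_batch, label_batch))  # hypothetical
```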

parallelising tf.data.Dataset.from_generator

夙愿已清 posted on 2019-11-27 11:56:18
I have a non-trivial input pipeline that from_generator is perfect for... dataset = tf.data.Dataset.from_generator(complex_img_label_generator, (tf.int32, tf.string)) dataset = dataset.batch(64) iter = dataset.make_one_shot_iterator() imgs, labels = iter.get_next() Here, complex_img_label_generator dynamically generates images and returns a numpy array representing an (H, W, 3) image and a simple string label. The processing is not something I can represent as reading from files and tf.image operations. My question is about how to parallelise the generator. How do I have N of these generators
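
A common workaround is to keep the generator (or a plain Dataset.range) trivially cheap and move the expensive Python work into .map() via tf.py_function, which does run in parallel. A sketch; make_sample is a hypothetical stand-in for the real image/label construction, and tf.data.AUTOTUNE is tf.data.experimental.AUTOTUNE on older releases:

```python
import numpy as np
import tensorflow as tf

def make_sample(idx):
    # Hypothetical stand-in for the expensive per-sample Python work.
    img = np.random.rand(32, 32, 3).astype(np.float32)
    label = ('label_%d' % int(idx)).encode()
    return img, label

def tf_make_sample(idx):
    img, label = tf.py_function(make_sample, [idx], (tf.float32, tf.string))
    img.set_shape([32, 32, 3])  # py_function loses static shape info
    return img, label

dataset = (tf.data.Dataset.range(1000)  # cheap "generator": just indices
           .map(tf_make_sample, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(64)
           .prefetch(tf.data.AUTOTUNE))
```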

TensorFlow: training on my own images

為{幸葍}努か posted on 2019-11-27 10:07:19
I am new to TensorFlow. I am looking for help with image recognition, where I can train my own image dataset. Is there any example for training a new dataset? If you are interested in how to input your own data in TensorFlow, you can look at this tutorial. I've also written a guide with best practices for CS230 at Stanford here. New answer (with tf.data) and with labels: with the introduction of tf.data in r1.4, we can create a batch of images without placeholders and without queues. The steps are the following: create a list containing the filenames of the images and a corresponding
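
A sketch of the pipeline those steps describe; the filenames and labels are hypothetical placeholders for your own data:

```python
import tensorflow as tf

# Hypothetical file list and integer labels for your own dataset.
filenames = ['images/cat1.jpg', 'images/dog1.jpg']
labels = [0, 1]

def parse(filename, label):
    image = tf.io.read_file(filename)                 # read the raw bytes
    image = tf.image.decode_jpeg(image, channels=3)   # decode to uint8 HWC
    image = tf.image.resize(image, [224, 224]) / 255.0
    return image, label

dataset = (tf.data.Dataset.from_tensor_slices((filenames, labels))
           .shuffle(len(filenames))
           .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))
```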

TensorFlow - Read video frames from TFRecords file

余生颓废 posted on 2019-11-27 03:30:38
Question: TL;DR: my question is about how to load compressed video frames from TFRecords. I am setting up a data pipeline for training deep learning models on a large video dataset (Kinetics). For this I am using TensorFlow, more specifically the tf.data.Dataset and TFRecordDataset structures. As the dataset contains ~300k videos of 10 seconds each, there is a large amount of data to deal with. During training, I want to randomly sample 64 consecutive frames from a video, therefore fast random sampling is
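
A hedged sketch of one possible record layout: each example stores all of a video's JPEG-encoded frames in a variable-length bytes feature, and the parser samples 64 consecutive frames and decodes only those, so storage stays compressed. The feature name and filename are hypothetical, fn_output_signature needs TF >= 2.3, and each video is assumed to have at least 64 frames:

```python
import tensorflow as tf

NUM_FRAMES = 64

def parse_video(serialized):
    # Assumed layout: all frames of one video, JPEG-encoded, in a
    # variable-length bytes feature called 'frames'.
    feats = tf.io.parse_single_example(
        serialized, {'frames': tf.io.VarLenFeature(tf.string)})
    frames = tf.sparse.to_dense(feats['frames'], default_value='')
    n = tf.shape(frames)[0]
    # Pick a random start so the 64-frame window fits inside the video.
    start = tf.random.uniform(
        [], 0, tf.maximum(n - NUM_FRAMES + 1, 1), dtype=tf.int32)
    clip = frames[start:start + NUM_FRAMES]
    # Decode only the sampled window; the rest stays compressed on disk.
    return tf.map_fn(lambda f: tf.image.decode_jpeg(f, channels=3),
                     clip, fn_output_signature=tf.uint8)

dataset = tf.data.TFRecordDataset(['kinetics-train-000.tfrecord'])  # hypothetical
dataset = dataset.map(parse_video, num_parallel_calls=tf.data.AUTOTUNE)
```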

Numpy to TFRecords: Is there a simpler way to handle batch inputs from tfrecords?

这一生的挚爱 posted on 2019-11-26 20:25:20
My question is about how to get batch inputs from multiple (or sharded) tfrecords. I've read the example https://github.com/tensorflow/models/blob/master/inception/inception/image_processing.py#L410 . The basic pipeline, taking the training set as an example, is: (1) first generate a series of tfrecords (e.g., train-000-of-005 , train-001-of-005 , ...); (2) from these filenames, generate a list and feed it into tf.train.string_input_producer to get a queue; (3) simultaneously generate a tf.RandomShuffleQueue to do other stuff; (4) use tf.train.batch_join to generate batch inputs. I think
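
For comparison, with tf.data the queue machinery above collapses into a few lines reading the same sharded files; parse_fn's feature layout is a hypothetical example to adapt to how the records were actually written:

```python
import tensorflow as tf

def parse_fn(serialized):
    # Hypothetical feature layout; adapt to your record-writing code.
    feats = tf.io.parse_single_example(
        serialized, {'x': tf.io.FixedLenFeature([10], tf.float32),
                     'y': tf.io.FixedLenFeature([], tf.int64)})
    return feats['x'], feats['y']

# Shard pattern taken from the question; interleave reads shards in parallel.
files = tf.data.Dataset.list_files('train-*-of-005', shuffle=True)
dataset = (files.interleave(tf.data.TFRecordDataset,
                            num_parallel_calls=tf.data.AUTOTUNE)
           .shuffle(10000)
           .map(parse_fn, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))
```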
