tensorflow-datasets

TensorFlow Dataset `.map` - Is it possible to ignore errors?

纵然是瞬间, submitted on 2019-12-01 01:29:55
Short version: When using Dataset map operations, is it possible to specify that any 'rows' where the map invocation results in an error are quietly filtered out, rather than having the error bubble up and kill the whole session? Specifics: I have an input pipeline set up that (more or less) does the following: reads a set of file paths of images stored locally (images of varying dimensions); reads a suggested set of 'bounding boxes' from a CSV; produces the set of all image-path-to-bounding-box combinations; reads and decodes each image, then produces the set of 'cropped' images for each of these
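For reference, a minimal sketch of that kind of pipeline (not the poster's code; the paths, box values, and the load_and_crop parser are hypothetical), where `tf.data.experimental.ignore_errors()` drops any element whose map call raised instead of failing the whole run:

```python
import tensorflow as tf

def load_and_crop(image_path, bbox):
    # Hypothetical map fn: decode an image, then crop it to the suggested box.
    image = tf.image.decode_jpeg(tf.io.read_file(image_path), channels=3)
    # crop_to_bounding_box raises at runtime if the box falls outside the
    # image, which is the kind of per-element error the question wants to skip.
    return tf.image.crop_to_bounding_box(image, bbox[0], bbox[1], bbox[2], bbox[3])

paths = tf.data.Dataset.from_tensor_slices(["img_a.jpg", "img_b.jpg"])          # placeholder paths
boxes = tf.data.Dataset.from_tensor_slices([[0, 0, 64, 64], [10, 10, 32, 32]])  # placeholder boxes
dataset = tf.data.Dataset.zip((paths, boxes)).map(load_and_crop)
# Elements whose map call raised an error are dropped silently instead of
# killing the session; in the 1.x contrib era the same op was
# tf.contrib.data.ignore_errors().
dataset = dataset.apply(tf.data.experimental.ignore_errors())
```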

Oversampling functionality in Tensorflow dataset API

旧街凉风, submitted on 2019-11-30 20:21:27
I would like to ask if the current Dataset API allows for the implementation of an oversampling algorithm. I am dealing with a highly imbalanced class problem. I was thinking it would be nice to oversample specific classes during dataset parsing, i.e. online generation. I've seen the implementation of the rejection_resample function; however, this removes samples instead of duplicating them, and it slows down batch generation (when the target distribution is much different from the initial one). The thing I would like to achieve is: take an example, look at its class probability, and decide whether or not to duplicate it.
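One way to express that kind of online duplication is a flat_map that repeats each example according to its class; the class count and oversampling factors below are made up for illustration and are not from the post:

```python
import tensorflow as tf

# Hypothetical per-class oversampling factors: class id -> copies to emit.
oversample_factor = tf.constant([1, 4, 2], dtype=tf.int64)  # e.g. class 1 is rare

def duplicate_rare_classes(features, label):
    # Emit the same (features, label) pair several times, so rare classes are
    # duplicated online instead of common ones being rejected, as happens
    # with rejection_resample.
    repeat_count = tf.gather(oversample_factor, label)
    return tf.data.Dataset.from_tensors((features, label)).repeat(repeat_count)

features = tf.constant([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
labels = tf.constant([0, 1, 2], dtype=tf.int64)
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
dataset = dataset.flat_map(duplicate_rare_classes)
dataset = dataset.shuffle(1024).batch(32)
```

Later releases also provide tf.data.experimental.sample_from_datasets for mixing per-class datasets with target weights, which is another route to the same goal.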

Tensorflow 1.10 TFRecordDataset - recovering TFRecords

戏子无情, submitted on 2019-11-30 19:26:57
Notes: this question extends a previous question of mine. In that question I ask about the best way to store some dummy data as Example and SequenceExample, seeking to know which is better for data similar to the dummy data provided. I provide explicit formulations of both the Example and SequenceExample construction as well as, in the answers, a programmatic way to do so. Because this is still a lot of code, I am providing a Colab (an interactive Jupyter notebook hosted by Google) file where you can try the code out yourself to assist. All the necessary code is there and it is generously
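The poster's full code lives in the linked Colab; purely as a generic illustration of the two proto types being compared (field names and values here are made up, not the poster's dummy data), a minimal sketch:

```python
import tensorflow as tf

# A flat record as a tf.train.Example (hypothetical field names).
example = tf.train.Example(features=tf.train.Features(feature={
    "length": tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    "values": tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0, 3.0])),
}))

# The same values as a tf.train.SequenceExample, with a per-step feature list.
sequence_example = tf.train.SequenceExample(
    context=tf.train.Features(feature={
        "length": tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    }),
    feature_lists=tf.train.FeatureLists(feature_list={
        "values": tf.train.FeatureList(feature=[
            tf.train.Feature(float_list=tf.train.FloatList(value=[v]))
            for v in (1.0, 2.0, 3.0)
        ]),
    }))

# Either proto can be serialized into a TFRecord file and read back later
# with tf.data.TFRecordDataset (tf.python_io is the TF 1.x writer name).
with tf.python_io.TFRecordWriter("dummy.tfrecord") as writer:
    writer.write(example.SerializeToString())
```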

How to use tensorflow's Dataset API Iterator as an input of a (recurrent) neural network?

℡╲_俬逩灬., submitted on 2019-11-30 16:10:13
Question: When using TensorFlow's Dataset API Iterator, my goal is to define an RNN that operates on the iterator's get_next() tensors as its input (see (1) in the code). However, simply defining the dynamic_rnn with get_next() as its input results in an error: ValueError: Initializer for variable rnn/basic_lstm_cell/kernel/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer. Now I know that
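For orientation, a sketch of the general pattern the question describes (TF 1.x graph mode; the shapes, batch size, and cell size are invented, and this baseline is not a verified fix for the ValueError, which depends on how the iterator and loop are constructed):

```python
import tensorflow as tf  # TF 1.x style, matching the question

# Toy dataset of [time, features] sequences with made-up dimensions.
dataset = tf.data.Dataset.from_tensor_slices(tf.random_uniform([100, 20, 8]))
dataset = dataset.batch(16).repeat()
iterator = dataset.make_one_shot_iterator()
inputs = iterator.get_next()          # (1) the iterator output becomes the RNN input

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=32)
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(outputs).shape)    # (batch, 20, 32)
```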

tf.contrib.data.Dataset seems does not support SparseTensor

生来就可爱ヽ(ⅴ<●), submitted on 2019-11-30 15:13:07
I generated a Pascal VOC 2007 tfrecords file using the code in the TensorFlow object detection API. I use the tf.contrib.data.Dataset API to read data from the tfrecords. I tried a method without the tf.contrib.data.Dataset API, and that code runs without any error, but when changed to the tf.contrib.data.Dataset API it does not work correctly. The code without tf.contrib.data.Dataset: import tensorflow as tf if __name__ == '__main__': slim_example_decoder = tf.contrib.slim.tfexample_decoder features = {"image/height": tf.FixedLenFeature((), tf.int64, default_value=1), "image/width": tf.FixedLenFeature((), tf
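As a point of reference (not the poster's full decoder; the bbox feature name and file path are assumptions), a minimal TF 1.x-era sketch of parsing a VarLenFeature inside a Dataset map, which is where the SparseTensor enters the pipeline:

```python
import tensorflow as tf

def parse_record(serialized):
    features = {
        "image/height": tf.FixedLenFeature((), tf.int64, default_value=1),
        "image/width": tf.FixedLenFeature((), tf.int64, default_value=1),
        # VarLenFeature parses to a SparseTensor, which early Dataset-based
        # pipelines often struggled with.
        "image/object/bbox/xmin": tf.VarLenFeature(tf.float32),
    }
    parsed = tf.parse_single_example(serialized, features)
    # One common workaround: densify the SparseTensor inside the map fn
    # before anything downstream that cannot handle sparse values.
    xmin = tf.sparse_tensor_to_dense(parsed["image/object/bbox/xmin"])
    return parsed["image/height"], parsed["image/width"], xmin

dataset = tf.data.TFRecordDataset(["pascal_voc_2007.tfrecords"])  # placeholder path
dataset = dataset.map(parse_record)
```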

TypeError: unsupported callable using Dataset with estimator input_fn

社会主义新天地, submitted on 2019-11-30 09:19:43
Question: I'm trying to convert the Iris tutorial (https://www.tensorflow.org/get_started/estimator) to read training data from .png files instead of .csv. It works using numpy_input_fn, but not when I build it from a Dataset. I think input_fn() is returning the wrong type, but I don't really understand what it should be or how to make it that. The error is: File "iris_minimal.py", line 27, in <module> model_fn().train(input_fn(), steps=1) ... raise TypeError('unsupported callable') from ex TypeError:
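One shape of input_fn that older Estimator versions accept is a callable returning a (features, labels) pair from a one-shot iterator; the sketch below uses invented file names and a trivial parser, and is a common pattern rather than necessarily the poster's fix:

```python
import tensorflow as tf  # TF 1.x style, matching the question

def input_fn():
    # Hypothetical pipeline: these filenames and labels stand in for the .png data.
    filenames = tf.constant(["img_0.png", "img_1.png"])
    labels = tf.constant([0, 1], dtype=tf.int32)
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))

    def parse(filename, label):
        image = tf.image.decode_png(tf.read_file(filename), channels=1)
        image = tf.reshape(tf.to_float(image), [-1])
        return {"x": image}, label

    dataset = dataset.map(parse).batch(2)
    # Pre-1.5 Estimators expect (features, labels) tensors rather than a
    # Dataset object, so hand back the iterator's get_next().
    return dataset.make_one_shot_iterator().get_next()

# Estimator.train takes the callable itself, not the result of calling it:
# estimator.train(input_fn=input_fn, steps=1)
```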

Feature Columns Embedding lookup

和自甴很熟, submitted on 2019-11-30 07:55:44
I have been working with the datasets and feature_columns in TensorFlow (https://developers.googleblog.com/2017/11/introducing-tensorflow-feature-columns.html). I see they have categorical features and a way to create embedding features from categorical features. But when working on NLP tasks, how do we create a single embedding lookup? For example, consider a text classification task. Every data point would have a lot of textual columns, but they would not be separate categories. How do we create and use a single embedding lookup for all these columns? Below is an example of how I am currently using
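One way to share a single embedding table across several text columns in TF 1.x is tf.feature_column.shared_embedding_columns; the column keys, bucket count, and dimension below are made up for illustration:

```python
import tensorflow as tf

# Hypothetical text columns whose tokens were already mapped to ids in [0, 10000).
title_col = tf.feature_column.categorical_column_with_identity(
    key="title_tokens", num_buckets=10000)
body_col = tf.feature_column.categorical_column_with_identity(
    key="body_tokens", num_buckets=10000)

# Every column in the list looks up into the same underlying embedding
# variable, instead of each text column getting its own table.
shared_cols = tf.feature_column.shared_embedding_columns(
    [title_col, body_col], dimension=64)

# shared_cols can then be passed to an estimator or to
# tf.feature_column.input_layer like any other feature columns.
```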

Restoring a Tensorflow model that uses Iterators

浪子不回头ぞ, submitted on 2019-11-30 05:12:16
I have a model that trains my network using an Iterator, following the new Dataset API pipeline model that's now recommended by Google. I read tfrecord files, feed data to the network, train nicely, and all is going well; I save my model at the end of training so I can run inference on it later. A simplified version of the code is as follows: """ Training and saving """ training_dataset = tf.contrib.data.TFRecordDataset(training_record) training_dataset = training_dataset.map(ds._path_records_parser) training_dataset = training_dataset.batch(BATCH_SIZE) with tf.name_scope("iterators"):
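A sketch of the usual pattern for this situation: checkpoint only the model variables during training, then at inference time rebuild the input pipeline with a fresh iterator and restore the weights into the new graph. The parser, model function, paths, and sizes below are placeholders, not the poster's code:

```python
import tensorflow as tf  # TF 1.x style, matching the question

def parse_fn(serialized):
    # Placeholder parser standing in for the training-time record parser.
    parsed = tf.parse_single_example(
        serialized, {"x": tf.FixedLenFeature([4], tf.float32)})
    return parsed["x"]

def build_model(x):
    # Placeholder model; at inference, reuse the same model-building code
    # (and variable names) used during training.
    return tf.layers.dense(x, 2, name="logits")

# Rebuild the input pipeline for inference with a fresh iterator.
dataset = tf.data.TFRecordDataset("inference.tfrecord")  # placeholder path
dataset = dataset.map(parse_fn).batch(32)
features = dataset.make_one_shot_iterator().get_next()
logits = build_model(features)

# The Saver restores model variables only; the iterator itself is simply
# recreated, which sidesteps trying to checkpoint the training iterator.
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "checkpoints/model.ckpt")  # placeholder checkpoint path
    print(sess.run(logits))
```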