tensorflow-datasets

Sliding window of a batch in Tensorflow using Dataset API

﹥>﹥吖頭↗ Submitted on 2019-12-04 12:14:43
Question: Is there a way to modify the composition of my images within a batch? At the moment, when I create e.g. a batch of size 4, my batches look like this:

    Batch1: [Img0 Img1 Img2 Img3]
    Batch2: [Img4 Img5 Img6 Img7]

I need to modify the composition of my batches so that the window shifts by only one image at a time. It should then look like this:

    Batch1: [Img0 Img1 Img2 Img3]
    Batch2: [Img1 Img2 Img3 Img4]
    Batch3: [Img2 Img3 Img4 Img5]
    Batch4: [Img3 Img4 Img5 Img6]
    Batch5: [Img4 Img5 Img6 Img7]
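A minimal sketch of one way to get this behavior with tf.data's window transformation (not from the question itself; Dataset.range stands in for the image dataset):

    import tensorflow as tf

    dataset = tf.data.Dataset.range(8)  # stand-in for a dataset of images
    # shift=1 advances the window start by one element per batch.
    windows = dataset.window(size=4, shift=1, drop_remainder=True)
    batches = windows.flat_map(lambda w: w.batch(4))
    # Yields [0 1 2 3], [1 2 3 4], [2 3 4 5], ...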

Getting free text features into Tensorflow Canned Estimators with Dataset API via feature_columns

元气小坏坏 Submitted on 2019-12-04 11:51:56
I'm trying to build a model that gives reddit_score = f('subreddit', 'comment'). Mainly this is an example I can then build on for a work project. My code is here. My problem is that I see that canned estimators, e.g. DNNLinearCombinedRegressor, must have feature_columns that are part of the FeatureColumn class. I have my vocab file and know that if I were to limit to just the first word of a comment I could do something like:

    tf.feature_column.categorical_column_with_vocabulary_file(
        key='comment',
        vocabulary_file='{}/vocab.csv'.format(INPUT_DIR))

But if I'm passing in, say, the first 10 words from
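A hedged sketch of how multi-word comments might be handled, assuming the input_fn has already split each comment into a fixed number of word tokens (so the 'comment' feature is a string tensor of shape [batch_size, 10]); the embedding dimension is an arbitrary choice:

    words = tf.feature_column.categorical_column_with_vocabulary_file(
        key='comment',
        vocabulary_file='{}/vocab.csv'.format(INPUT_DIR))
    # embedding_column averages the per-word embeddings (combiner='mean'),
    # giving a fixed-size dense input the canned estimators can consume.
    comment_embedding = tf.feature_column.embedding_column(words, dimension=16)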

Train Tensorflow model with estimator (from_generator)

若如初见. Submitted on 2019-12-04 11:26:45
I am trying to train an estimator with a generator, but I want to feed this estimator with a package of samples for each iteration. I show the code:

    def _generator():
        for i in range(100):
            feats = np.random.rand(4, 2)
            labels = np.random.rand(4, 1)
            yield feats, labels

    def input_func_gen():
        shapes = ((4, 2), (4, 1))
        dataset = tf.data.Dataset.from_generator(generator=_generator,
                                                 output_types=(tf.float32, tf.float32),
                                                 output_shapes=shapes)
        dataset = dataset.batch(4)
        # dataset = dataset.repeat(20)
        iterator = dataset.make_one_shot_iterator()
        features_tensors, labels = iterator.get_next()
        features = {'x':
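For reference, a minimal runnable sketch of a complete input_fn of this shape (the feature key 'x' is taken from the truncated code above; the rest is an assumption, and the extra .batch(4) is dropped since the generator already yields packages of 4 samples):

    import numpy as np
    import tensorflow as tf

    def _generator():
        # Each yield is already one package of 4 samples: feats (4, 2), labels (4, 1).
        for _ in range(100):
            yield np.random.rand(4, 2), np.random.rand(4, 1)

    def input_func_gen():
        dataset = tf.data.Dataset.from_generator(
            generator=_generator,
            output_types=(tf.float32, tf.float32),
            output_shapes=((4, 2), (4, 1)))
        iterator = dataset.make_one_shot_iterator()
        features_tensors, labels = iterator.get_next()
        # Estimators expect a dict of features keyed to match the feature columns.
        return {'x': features_tensors}, labels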

How to use tf.data's initializable iterator and reinitializable iterator and feed data to the Estimator API?

馋奶兔 Submitted on 2019-12-04 08:22:20
All the official Google tutorials use the one-shot iterator for all the Estimator API implementations; I couldn't find any documentation on how to use tf.data's initializable iterator and reinitializable iterator instead of the one-shot iterator. Can someone kindly show me how to switch between train_data and test_data using tf.data's initializable and reinitializable iterators? We need to run a session to use a feed dict and switch the dataset in the initializable iterator; it's a low-level API, and it's confusing how to use it as part of the Estimator API architecture. PS: I did find that Google
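One known workaround, sketched here as an assumption rather than an official pattern: build an initializable iterator inside input_fn and run its initializer from a SessionRunHook, since the Estimator owns the session:

    import tensorflow as tf

    class IteratorInitializerHook(tf.train.SessionRunHook):
        """Runs the iterator's initializer once the Estimator creates its session."""
        def __init__(self):
            self.initializer = None  # set inside input_fn

        def after_create_session(self, session, coord):
            session.run(self.initializer)

    def make_input_fn(dataset_fn, hook):
        def input_fn():
            dataset = dataset_fn()  # build the dataset inside the Estimator's graph
            iterator = dataset.make_initializable_iterator()
            hook.initializer = iterator.initializer
            return iterator.get_next()
        return input_fn

    # Usage: one dataset_fn/hook pair for training, another for evaluation, e.g.
    # estimator.train(make_input_fn(train_dataset_fn, train_hook), hooks=[train_hook])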

Tensorflow: create minibatch from numpy array > 2 GB

喜你入骨 Submitted on 2019-12-04 04:34:16
Question: I am trying to feed minibatches of numpy arrays to my model, but I'm stuck with batching. Using tf.train.shuffle_batch raises an error because the images array is larger than 2 GB. I tried to get around it by creating placeholders, but when I try to feed the arrays they are still represented by tf.Tensor objects. My main concern is that I defined the operations under the model class and the objects don't get called before running the session. Does anyone have an idea how to handle this?
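A hedged sketch of the usual workaround for the 2 GB GraphDef limit: keep the arrays out of the graph behind placeholders and feed them exactly once, when an initializable iterator is initialized (array names and shapes here are stand-ins):

    import numpy as np
    import tensorflow as tf

    images = np.random.rand(1000, 64, 64, 3).astype(np.float32)  # stand-in data
    labels = np.random.randint(0, 10, size=(1000,))

    images_ph = tf.placeholder(images.dtype, images.shape)
    labels_ph = tf.placeholder(labels.dtype, labels.shape)

    dataset = (tf.data.Dataset.from_tensor_slices((images_ph, labels_ph))
               .shuffle(1000).batch(32).repeat())
    iterator = dataset.make_initializable_iterator()
    next_batch = iterator.get_next()

    with tf.Session() as sess:
        # The big arrays are fed once here instead of being baked into the graph.
        sess.run(iterator.initializer,
                 feed_dict={images_ph: images, labels_ph: labels})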

Flatten a dataset in TensorFlow

为君一笑 Submitted on 2019-12-04 04:10:10
Question: I am trying to convert a dataset in TensorFlow into several single-valued tensors. The dataset currently looks like this:

    [12 43 64 34 45 2 13 54]
    [34 65 34 67 87 12 23 43]
    [23 53 23 1 5]
    ...

After the transformation it should look like this:

    [12] [43] [64] [34] [45] [2] [13] [54] [34] [65] [34] [67] [87] [12]
    ...

My initial idea was to use flat_map on the dataset and then convert each tensor to a list of tensors using reshape and unstack:

    output_labels = self.dataset.flat_map(convert
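A minimal sketch of how flat_map can do this directly (the sample values mirror the question; from_tensor_slices splits each row tensor into a stream of its scalar elements):

    import tensorflow as tf

    rows = ([12, 43, 64, 34, 45, 2, 13, 54],
            [34, 65, 34, 67, 87, 12, 23, 43],
            [23, 53, 23, 1, 5])
    dataset = tf.data.Dataset.from_generator(lambda: rows, output_types=tf.int32)

    # Each variable-length row becomes a sub-dataset of scalars, concatenated in order.
    flat = dataset.flat_map(tf.data.Dataset.from_tensor_slices)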

How to make tf.data.Dataset return all of the elements in one call?

て烟熏妆下的殇ゞ Submitted on 2019-12-04 03:28:18
Is there an easy way to get the entire set of elements in a tf.data.Dataset? I.e., I want to set the batch size of the Dataset to the size of my dataset without explicitly passing it the number of elements. This would be useful for a validation dataset where I want to measure accuracy on the entire dataset in one go. I'm surprised there isn't a method to get the size of a tf.data.Dataset. In short, there is not a good way to get the size/length; tf.data.Dataset is built for pipelines of data, so it has an iterator structure (in my understanding, and according to my read of the Dataset ops code.
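A hedged sketch of one common workaround: batch with a bound known to be at least the dataset size, so a single get_next() returns every element at once:

    import tensorflow as tf

    dataset = tf.data.Dataset.range(10)  # stand-in for the validation dataset
    everything = dataset.batch(1 << 30)  # any bound >= the true size works
    all_elements = everything.make_one_shot_iterator().get_next()

    with tf.Session() as sess:
        print(sess.run(all_elements))  # [0 1 2 ... 9]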

On-the-fly generation with the Dataset API in TensorFlow

こ雲淡風輕ζ Submitted on 2019-12-03 21:03:21
I have a function which produces feature and target tensors, e.g.:

    x, t = myfunc()  # x, t are tensors

How can I integrate this with TensorFlow's Dataset API for continuous training? Ideally I would like to use the dataset to set things like batching and transformations. Edit for clarification: the problem is that I would like to not just put x and t in my graph, but to make a dataset from them, so that I can use the same dataset processing that I have implemented for (normal) finite datasets that I can load into memory and feed into the same graph using an initializable iterator. Assuming x and t are tf.Tensor objects,
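One hedged way to sketch this (not the truncated answer itself): drive an infinite dummy dataset through map, so the ops that produce the features and targets are rebuilt inside the map function and re-executed for every element:

    import tensorflow as tf

    def myfunc():  # stand-in for the question's generating function
        x = tf.random_uniform([2])   # feature tensor
        t = tf.random_uniform([1])   # target tensor
        return x, t

    # Each element pull re-runs myfunc's ops, giving on-the-fly generation.
    dataset = tf.data.Dataset.from_tensors(0).repeat().map(lambda _: myfunc())
    dataset = dataset.batch(32)  # the usual pipeline stages now apply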

Input multiple files into Tensorflow dataset

余生长醉 Submitted on 2019-12-03 17:41:28
I have the following input_fn:

    def input_fn(filenames, batch_size):
        # Create a dataset containing the text lines.
        dataset = tf.data.TextLineDataset(filenames).skip(1)
        # Parse each line.
        dataset = dataset.map(_parse_line)
        # Shuffle, repeat, and batch the examples.
        dataset = dataset.shuffle(10000).repeat().batch(batch_size)
        # Return the dataset.
        return dataset

It works great if filenames=['file1.csv'] or filenames=['file2.csv']. It gives me an error if filenames=['file1.csv', 'file2.csv']. The TensorFlow documentation says filenames is a tf.string tensor containing one or more filenames. How
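A likely explanation, with a hedged fix: with several CSVs, .skip(1) on the combined TextLineDataset drops only the very first line overall, so the second file's header row reaches _parse_line and fails to parse. Skipping the header per file avoids that:

    dataset = (tf.data.Dataset.from_tensor_slices(filenames)
               .flat_map(lambda f: tf.data.TextLineDataset(f).skip(1)))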

TensorFlow Custom Estimator - Restore model after small changes in model_fn

隐身守侯 Submitted on 2019-12-03 13:37:46
I am using tf.estimator.Estimator to develop my model. I wrote a model_fn and trained for 50,000 iterations; now I want to make a small change to my model_fn, for example add a new layer. I don't want to start training from scratch: I want to restore all the old variables from the 50,000-step checkpoint and continue training from that point. When I try to do so I get a NotFoundError. How can this be done with tf.estimator.Estimator? TL;DR The easiest way to load variables from a previous checkpoint is to use the function tf.train.init_from_checkpoint(). Just one call to this function inside the
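A hedged sketch of the pattern the truncated answer points at (OLD_CHECKPOINT and the layer names are assumptions; train into a fresh model_dir so the Estimator doesn't try to restore the incompatible checkpoint itself):

    import tensorflow as tf

    OLD_CHECKPOINT = '/tmp/old_model_dir'  # assumed path to the 50,000-step checkpoint

    def model_fn(features, labels, mode):
        net = tf.layers.dense(features['x'], 64, name='dense_old')  # unchanged layer
        net = tf.layers.dense(net, 1, name='dense_new')             # newly added layer
        loss = tf.losses.mean_squared_error(labels, net)

        if mode == tf.estimator.ModeKeys.TRAIN:
            # Override the initializers of every variable whose name also exists
            # in the old checkpoint; 'dense_new' has no match and keeps its
            # fresh initializer.
            tf.train.init_from_checkpoint(OLD_CHECKPOINT, {'/': '/'})
            train_op = tf.train.AdamOptimizer().minimize(
                loss, global_step=tf.train.get_global_step())
            return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

        return tf.estimator.EstimatorSpec(mode, loss=loss)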