What is tensorflow.python.data.ops.dataset_ops._OptionsDataset?

Submitted by 蹲街弑〆低调 on 2020-12-01 09:46:52

Question


I am using the Transformer code from the TensorFlow tutorial - https://www.tensorflow.org/beta/tutorials/text/transformer

In this code, the dataset used is loaded like this -

examples, metadata = tfds.load('ted_hrlr_translate/pt_to_en', with_info=True,
                               as_supervised=True)
train_examples, val_examples = examples['train'], examples['validation']

When I check the type of train_examples using:

type(train_examples)

I get the following as output -

tensorflow.python.data.ops.dataset_ops._OptionsDataset

Now I want to change some entries of the dataset (that is, some of the sentences), but I can't, because I don't understand this type.

I am able to iterate over it using:

for data in train_examples:
    print(data,type(data))

And the type of data is -

<class 'tuple'>

Finally, what I want is to replace some of these tuples with my own data. Can someone tell me how to do this, or give me some details about the type tensorflow.python.data.ops.dataset_ops._OptionsDataset?


Answer 1:


tensorflow.python.data.ops.dataset_ops._OptionsDataset is just another class that extends the base class tf.compat.v2.data.Dataset (DatasetV2). It holds a tf.data.Options object along with the original tf.compat.v2.data.Dataset (the Portuguese-English tuples in your case).

(tf.data.Options holds pipeline settings that take effect when the dataset is processed, for example through streaming transformations such as tf.data.Dataset.map or tf.data.Dataset.interleave.)
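To illustrate the wrapping described above, here is a minimal sketch. It assumes TensorFlow 2.x with eager execution, and it uses a tiny in-memory dataset as a stand-in for the tutorial data; the option that is set is only an illustrative choice, not something the tutorial does.

```python
import tensorflow as tf

# Stand-in for the tutorial data: two (pt, en) string pairs (made up).
base = tf.data.Dataset.from_tensor_slices((
    ["bom dia .", "obrigado ."],
    ["good morning .", "thank you ."],
))

# Attach an options object; this particular setting is only illustrative.
options = tf.data.Options()
options.experimental_optimization.noop_elimination = True
wrapped = base.with_options(options)

# The concrete class name is an internal detail and may vary by version,
# but the wrapper is still a tf.data.Dataset with the same elements.
print(type(wrapped).__name__)
print(isinstance(wrapped, tf.data.Dataset))

pairs = [(pt.numpy(), en.numpy()) for pt, en in wrapped]
print(pairs)
```

So anything you can do with a tf.data.Dataset (iterate, map, filter, concatenate) should also work on the object returned by tfds.load.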

How to view the individual elements?

I'm sure there are many ways, but one straightforward way is to use the iterator provided by the base class.

Since examples['train'] is an _OptionsDataset, you can iterate over it with the iterator that tf.compat.v2.data.Dataset provides:

# Requires eager execution (the default in TF 2.x).
iterator = iter(examples['train'])
# Each element is a (pt, en) tuple of scalar string tensors.
pt, en = next(iterator)
print(pt.numpy())
print(en.numpy())

Here is the output:

b'o problema \xc3\xa9 que nunca vivi l\xc3\xa1 um \xc3\xbanico dia .'
b"except , i 've never lived one day of my life there ."
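The output above is printed as raw bytes, which is why the accented characters show up as escape sequences. A small sketch of decoding them into ordinary Python strings follows; it assumes TensorFlow 2.x with eager execution and uses a one-element stand-in dataset built from the pair shown above rather than the real examples['train'].

```python
import tensorflow as tf

# Stand-in for examples['train'], built from the pair printed above.
train_examples = tf.data.Dataset.from_tensor_slices((
    ["o problema é que nunca vivi lá um único dia ."],
    ["except , i 've never lived one day of my life there ."],
))

decoded = []
for pt, en in train_examples.take(1):
    # .numpy() returns bytes for string tensors; decode them as UTF-8.
    decoded.append((pt.numpy().decode("utf-8"), en.numpy().decode("utf-8")))

print(decoded)
```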

Substituting with your own data:

Since you haven't mentioned what you want to substitute into the original dataset, I'll assume you have a CSV/TSV file of your own translations. In that case, it should be useful to create a separate tf.compat.v2.data.Dataset object by calling the CSV API to read your file into a dataset:

tf.data.experimental.make_csv_dataset

https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/r2/tutorials/load_data/csv.ipynb
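Once your own sentences are available as a dataset, one possible way to "replace" entries is to drop the unwanted elements by position and append your own pairs. The sketch below is only an illustration under assumptions: it uses small in-memory stand-ins instead of the real examples['train'] and instead of a CSV file, and the names my_examples, drop_indices and keep are hypothetical. A dataset loaded from CSV would first need to be mapped to the same (pt, en) tuple structure before it can be concatenated.

```python
import tensorflow as tf

# Stand-in for examples['train']: a dataset of (pt, en) string pairs.
train_examples = tf.data.Dataset.from_tensor_slices((
    ["bom dia .", "obrigado .", "ate logo ."],
    ["good morning .", "thank you .", "see you later ."],
))

# Hypothetical replacement pairs, with the same (pt, en) structure.
my_examples = tf.data.Dataset.from_tensor_slices((
    ["muito obrigado ."],
    ["thank you very much ."],
))

# Positions of the original elements to drop (hypothetical choice).
drop_indices = tf.constant([1], dtype=tf.int64)

def keep(index, pair):
    # True when this element's position is not in drop_indices.
    return tf.reduce_all(tf.not_equal(index, drop_indices))

# enumerate() yields (index, (pt, en)); filter by index, then drop the index.
kept = (train_examples.enumerate()
        .filter(keep)
        .map(lambda index, pair: pair))

# Append the replacement pairs after the kept originals.
combined = kept.concatenate(my_examples)

result = [(pt.numpy(), en.numpy()) for pt, en in combined]
print(result)
```

Datasets are not modified in place: each transformation returns a new dataset, so train_examples itself stays unchanged and combined is what you would feed into the rest of the tutorial pipeline.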



Source: https://stackoverflow.com/questions/56820723/what-is-tensorflow-python-data-ops-dataset-ops-optionsdataset
