tfrecord

How to store numpy arrays as tfrecord?

Submitted by 帅比萌擦擦* on 2020-07-18 08:59:06
Question: I am trying to create a dataset in tfrecord format from numpy arrays. I am trying to store 2d and 3d coordinates. The 2d coordinates are a numpy array of shape (2, 10) and type float64; the 3d coordinates are a numpy array of shape (3, 10) and type float64. This is my code:

```python
def _floats_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))

train_filename = 'train.tfrecords'  # address to save the TFRecords file
writer = tf.python_io.TFRecordWriter(train_filename)
for c in range(0
```
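A likely fix, as a minimal sketch rather than a definitive implementation: tf.train.FloatList only accepts a flat sequence, so the (2, 10) and (3, 10) arrays have to be flattened before writing. The feature names coords_2d/coords_3d and the random sample data below are hypothetical, and tf.io.TFRecordWriter (TF 1.14+/2.x) stands in for the older tf.python_io.TFRecordWriter:

```python
import numpy as np
import tensorflow as tf

def _floats_feature(value):
    # FloatList only accepts a flat sequence, so multi-dimensional arrays
    # must be flattened (store the shape separately if it needs to be
    # recovered at read time).
    return tf.train.Feature(
        float_list=tf.train.FloatList(value=value.reshape(-1)))

coords_2d = np.random.rand(2, 10)  # hypothetical sample data
coords_3d = np.random.rand(3, 10)

with tf.io.TFRecordWriter('train.tfrecords') as writer:
    example = tf.train.Example(features=tf.train.Features(feature={
        'coords_2d': _floats_feature(coords_2d),
        'coords_3d': _floats_feature(coords_3d),
    }))
    writer.write(example.SerializeToString())
```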

Write tfrecords from a beam pipeline?

Submitted by 随声附和 on 2020-04-30 08:21:47
Question: I have some data in Map format and I want to convert it to tfrecords using a Beam pipeline. Here is my attempt at the code. I have done this in Python, where it works, but I need to implement it in Java because some business logic cannot be ported to Python. The corresponding working Python implementation can be found here in this question.

```java
import com.google.protobuf.ByteString;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.protobuf.ProtoCoder;
```
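Whatever the pipeline language, the conversion itself boils down to building a tf.train.Example from each map entry and serializing it to bytes. A minimal Python sketch of just that step, assuming the map values are already bytes (the feature names are hypothetical):

```python
import tensorflow as tf

def map_to_example_bytes(record):
    # record: a dict mapping feature names to bytes values (an assumption)
    features = {
        key: tf.train.Feature(bytes_list=tf.train.BytesList(value=[val]))
        for key, val in record.items()
    }
    example = tf.train.Example(features=tf.train.Features(feature=features))
    return example.SerializeToString()

print(map_to_example_bytes({'name': b'alice', 'city': b'paris'}))
```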

Writing tfrecords in apache_beam with Java

Submitted by 牧云@^-^@ on 2020-04-18 01:06:00
Question: How can I write the following code in Java? If I have a list of records/dicts in Java, how can I write the Beam code that writes them to tfrecords with serialized tf.train.Examples? There are lots of examples of doing this in Python; below is one. How can I write the same logic in Java?

```python
import tensorflow as tf
import apache_beam as beam
from apache_beam.runners.interactive import interactive_runner
from apache_beam.coders import ProtoCoder

class Foo(beam.DoFn):
    def process
```
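Since the snippet breaks off inside the DoFn, here is a hedged sketch of how that Python pattern typically completes (the record schema and output path are hypothetical): the DoFn yields serialized tf.train.Example bytes, which beam.io.WriteToTFRecord then writes out.

```python
import tensorflow as tf
import apache_beam as beam

class DictToExample(beam.DoFn):
    """Hypothetical DoFn turning a dict record into serialized Example bytes."""
    def process(self, record):
        feature = {
            key: tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[str(val).encode()]))
            for key, val in record.items()
        }
        example = tf.train.Example(features=tf.train.Features(feature=feature))
        yield example.SerializeToString()

with beam.Pipeline() as p:
    (p
     | beam.Create([{'id': 1, 'name': 'alice'}])  # hypothetical records
     | beam.ParDo(DictToExample())
     | beam.io.WriteToTFRecord('out/records'))    # hypothetical path prefix
```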

Transforming the data stored in tfrecord format to become inputs to an LSTM Keras model in Tensorflow and fitting the model with that data

Submitted by 亡梦爱人 on 2020-03-05 03:39:42
Question: I have a very long dataframe (25 million rows x 500 columns) which I can access as a csv file or a parquet file but which I cannot load into the RAM of my PC. The data should be shaped appropriately to become input to a Keras LSTM model (Tensorflow 2), given a desired number of timestamps per sample and a desired number of samples per batch. This is my second post on this subject. I have already been given the advice to convert the data to tfrecord format. Since my original environment will
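As a rough illustration of the shaping step (not the asker's actual schema: the feature key 'row' and the window sizes below are hypothetical), a tf.data pipeline can stream the tfrecord file and batch twice, first rows into timesteps and then samples into batches, so the 25 million rows never have to fit in RAM:

```python
import tensorflow as tf

N_COLS = 500      # row width, per the question
TIMESTEPS = 32    # hypothetical timestamps per sample
BATCH_SIZE = 64   # hypothetical samples per batch

def parse_row(serialized):
    # Assumes each record stores one row as a fixed-length float feature
    parsed = tf.io.parse_single_example(
        serialized, {'row': tf.io.FixedLenFeature([N_COLS], tf.float32)})
    return parsed['row']

ds = (tf.data.TFRecordDataset('train.tfrecords')
      .map(parse_row)
      .batch(TIMESTEPS, drop_remainder=True)    # -> (TIMESTEPS, N_COLS)
      .batch(BATCH_SIZE, drop_remainder=True))  # -> (BATCH_SIZE, TIMESTEPS, N_COLS)
```

Note this yields inputs only; in practice parse_row would also return a target so model.fit(ds) can train, and overlapping windows would use tf.data.Dataset.window instead of the first batch.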

How to convert multiple parquet files into TFRecord files using Spark?

Submitted by 不羁岁月 on 2020-02-28 17:24:08
Question: I would like to produce stratified TFRecord files from a large DataFrame based on a certain condition, for which I use write.partitionBy(). I am also using the tensorflow-connector in Spark, but it apparently does not work together with a write.partitionBy() operation. Therefore, I have not found another way than to work in two steps:

1. Repartition the dataframe according to my condition using partitionBy() and write the resulting partitions to parquet files.
2. Read those parquet files to
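For the second step, the tensorflow-connector write typically looks like the following PySpark sketch; the paths are hypothetical, and the connector JAR is assumed to be on the classpath (e.g. via --packages org.tensorflow:spark-tensorflow-connector_2.11:1.15.0):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-to-tfrecord").getOrCreate()

# Step 2: read the parquet partitions written in step 1
df = spark.read.parquet("/data/partitioned_parquet")  # hypothetical path

# Write the rows back out as serialized tf.train.Example records
(df.write
   .format("tfrecords")
   .option("recordType", "Example")
   .mode("overwrite")
   .save("/data/tfrecords"))  # hypothetical output path
```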

Tensorflow TFRecord: Can't parse serialized example

Submitted by 梦想与她 on 2020-02-20 03:32:52
Question: I am trying to follow this guide in order to serialize my input data into the TFRecord format, but I keep hitting this error when trying to read it:

InvalidArgumentError: Key: my_key. Can't parse serialized Example.

I am not sure where I'm going wrong. Here is a minimal reproduction of the issue I cannot get past. Serialize some sample data:

```python
with tf.python_io.TFRecordWriter('train.tfrecords') as writer:
    for idx in range(10):
        example = tf.train.Example(
            features=tf.train.Features(
                feature={
```
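In practice this error most often means the feature_description on the read side disagrees with what was written (wrong dtype, or a FixedLenFeature shape that doesn't match the stored length). A matched write/read pair as a minimal sketch, assuming TF 2.x and a single int64 feature under the question's key my_key:

```python
import tensorflow as tf

# Write: my_key holds exactly one int64 value per example.
with tf.io.TFRecordWriter('train.tfrecords') as writer:
    for idx in range(10):
        example = tf.train.Example(features=tf.train.Features(feature={
            'my_key': tf.train.Feature(
                int64_list=tf.train.Int64List(value=[idx])),
        }))
        writer.write(example.SerializeToString())

# Read: the declared shape () and dtype tf.int64 must match the write side;
# declaring tf.float32 here, or a shape the data doesn't have, reproduces
# "Can't parse serialized Example".
feature_description = {'my_key': tf.io.FixedLenFeature([], tf.int64)}

ds = tf.data.TFRecordDataset('train.tfrecords').map(
    lambda s: tf.io.parse_single_example(s, feature_description))
for parsed in ds:
    print(parsed['my_key'].numpy())
```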

How to combine a pre-trained KerasLayer from TensorFlow (v. 2) Hub and tfrecords?

Submitted by て烟熏妆下的殇ゞ on 2020-01-25 03:09:52
Question: I have a tfrecord with 23 classes and 35 images in each class (805 in total). My current tfrecord read function is:

```python
def read_tfrecord(serialized_example):
    feature_description = {
        'image': tf.io.FixedLenFeature((), tf.string),
        'label': tf.io.FixedLenFeature((), tf.int64),
        'height': tf.io.FixedLenFeature((), tf.int64),
        'width': tf.io.FixedLenFeature((), tf.int64),
        'depth': tf.io.FixedLenFeature((), tf.int64)
    }
    example = tf.io.parse_single_example(serialized_example, feature_description)
    image
```
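One hedged way to wire such a parser to a pre-trained hub layer: the hub URL, image size, JPEG encoding, and training settings below are assumptions, while the 23-class head comes from the question.

```python
import tensorflow as tf
import tensorflow_hub as hub

IMG_SIZE = 224  # assumed input size for the chosen hub module

def read_tfrecord(serialized_example):
    feature_description = {
        'image': tf.io.FixedLenFeature((), tf.string),
        'label': tf.io.FixedLenFeature((), tf.int64),
    }
    example = tf.io.parse_single_example(serialized_example, feature_description)
    image = tf.io.decode_jpeg(example['image'], channels=3)  # assumes JPEG bytes
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE)) / 255.0
    return image, example['label']

ds = (tf.data.TFRecordDataset('images.tfrecord')  # hypothetical file
      .map(read_tfrecord)
      .batch(32))

model = tf.keras.Sequential([
    # Any image feature-vector module works here; this URL is one example.
    hub.KerasLayer(
        'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4',
        trainable=False, input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.Dense(23, activation='softmax'),  # 23 classes per the question
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(ds, epochs=5)
```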

TypeError: expected bytes, Descriptor found

Submitted by 安稳与你 on 2020-01-22 13:17:27
Question: How do I fix the following error traceback regarding tf.record?

```
(tensorflow1) c:\tensorflow1\models\research\object_detection>python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record --image_dir=(image directory)
Traceback (most recent call last):
  File "generate_tfrecord.py", line 17, in <module>
    import tensorflow as tf
  File "C:\Users\Dell-Oguz\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\__init__.py", line 24, in <module>
    from tensorflow
```
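Note the traceback fails inside TensorFlow's own import rather than in generate_tfrecord.py, and this particular TypeError is commonly associated with a broken or mismatched protobuf installation in the environment; reinstalling protobuf in the tensorflow1 env is a frequently suggested (not guaranteed) remedy. A quick sanity check:

```python
# Run inside the same tensorflow1 env; if this import itself fails,
# reinstalling protobuf (e.g. pip install --force-reinstall protobuf)
# is the usual next step.
import google.protobuf
print(google.protobuf.__version__)
```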
