Obtaining the total number of records from a .tfrecords file in TensorFlow

南方客 2020-12-13 10:27

Is it possible to obtain the total number of records from a .tfrecords file? Related to this, how does one generally keep track of the number of epochs that h

4 answers
  •  慢半拍i (original poster)
    2020-12-13 10:41

    No, it is not possible without reading the file. TFRecord does not store any metadata about the data inside it. The file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired.

    If you want, you can store this metadata manually or use a record iterator to get the number (you will need to iterate through all the records in the file):

    sum(1 for _ in tf.python_io.tf_record_iterator(file_name))
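
    The one-liner above uses the TensorFlow 1.x API. If TensorFlow is not at hand, the count can also be sketched in plain Python by walking the record framing described in the TensorFlow documentation: each record is an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The function name count_tfrecords is hypothetical, and the sketch assumes an uncompressed file and does not verify the CRCs:

```python
import os
import struct


def count_tfrecords(path):
    """Count records in an uncompressed TFRecord file (hypothetical helper).

    Only the length headers are read; payloads are skipped with seek(),
    and the CRC fields are NOT verified (assumption: the file is intact).
    """
    count = 0
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        while True:
            # 8-byte little-endian payload length + 4-byte CRC of the length.
            header = f.read(12)
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            # Skip the payload and the trailing 4-byte CRC of the payload.
            end = f.tell() + length + 4
            if end > file_size:
                raise ValueError("truncated record")
            f.seek(end)
            count += 1
    return count
```

    This still has to touch every record header, so it is sequential like the iterator, but it avoids reading the payloads themselves.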
    

    If you want to know the current epoch, you can read it from TensorBoard or print the counter from your training loop.
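
    Once the record count is known, the epoch can also be derived from the step counter. A minimal sketch, assuming every step consumes one full batch and the data is repeated back to back; the name epoch_from_step and its parameters are hypothetical:

```python
def epoch_from_step(global_step, batch_size, num_records):
    """Return the zero-based epoch reached after `global_step` steps.

    Assumption: each step consumes exactly `batch_size` records and the
    dataset repeats without gaps, so examples seen = global_step * batch_size.
    """
    if batch_size <= 0 or num_records <= 0:
        raise ValueError("batch_size and num_records must be positive")
    examples_seen = global_step * batch_size
    return examples_seen // num_records


# Example: print the epoch from inside a (stand-in) training loop.
for step in range(0, 100, 25):
    print("step", step, "epoch", epoch_from_step(step, 32, 1000))
```

    If the last batch of each epoch is dropped or padded, the estimate drifts slightly, so treat it as approximate.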
