Keras ImageDataGenerator for Cloud ML Engine

Submitted by £可爱£侵袭症+ on 2019-12-04 17:22:58

If you check the source code, you will see that the error arises when Keras (or TF) tries to build the list of classes from your directories. Because you are giving it a GCS directory (gs://), this does not work. You can bypass the error by providing the classes argument yourself, for example:

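import os

from google.cloud import storage


def get_classes(file_dir):
    """Return the class names (immediate subdirectories) of file_dir."""
    if not file_dir.startswith("gs://"):
        # Local path: every subdirectory is a class.
        return sorted(c for c in os.listdir(file_dir)
                      if os.path.isdir(os.path.join(file_dir, c)))

    # GCS path: split "gs://bucket/some/prefix" into bucket and prefix.
    bucket_name, _, prefix = file_dir[len("gs://"):].partition("/")
    if prefix and not prefix.endswith("/"):
        prefix += "/"

    client = storage.Client()
    bucket = client.get_bucket(bucket_name)

    # With a delimiter, "subdirectories" are reported as prefixes, which are
    # only populated once the iterator has been consumed.
    iterator = bucket.list_blobs(delimiter="/", prefix=prefix)
    for _ in iterator:
        pass
    # Prefixes are full paths such as "data/train/cats/"; keep the last part.
    return sorted(p[len(prefix):].rstrip("/") for p in iterator.prefixes)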

Passing these classes to flow_from_directory gets rid of the error, but the generator still does not find the files themselves (I now get, e.g., "Found 0 images belonging to 2 classes.").

The only 'direct' workaround I have found is to copy your files to local disk and read them from there. It would be great to have another solution, since copying can take a long time (for images, for example).
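
The copy step could look roughly like the sketch below. It is only an illustration under assumptions: download_prefix is a hypothetical helper name, and the bucket argument is assumed to behave like a google.cloud.storage Bucket (list_blobs(prefix=...) yielding blobs that have .name and .download_to_filename()). The bucket name and paths in the usage comment are placeholders.

```python
import os


def download_prefix(bucket, prefix, dest_dir):
    """Copy every object under `prefix` in `bucket` into `dest_dir`.

    `bucket` is assumed to behave like google.cloud.storage.Bucket.
    Returns the list of local file paths that were written.
    """
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    copied = []
    for blob in bucket.list_blobs(prefix=prefix):
        relative = blob.name[len(prefix):]
        if not relative or relative.endswith("/"):
            continue  # skip "directory" placeholder objects
        # Recreate the class subdirectories below dest_dir.
        local_path = os.path.join(dest_dir, *relative.split("/"))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        blob.download_to_filename(local_path)
        copied.append(local_path)
    return copied


# Hypothetical usage (bucket name and paths are placeholders):
#   from google.cloud import storage
#   bucket = storage.Client().bucket("my-bucket")
#   download_prefix(bucket, "data/train", "/tmp/train")
#   datagen.flow_from_directory("/tmp/train", ...)
```

Because the class subdirectories are preserved, flow_from_directory can then infer the classes from the local copy as usual.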

Other resources also suggest using TensorFlow's file_io module when interacting with GCS from Cloud ML Engine, but in this case that would require you to rewrite flow_from_directory yourself.
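
To give an idea of what such a rewrite would involve, here is a minimal sketch of a batch generator that reads files through a pluggable opener. It is not Keras code: batches_from_paths, open_fn and decode are hypothetical names, the (path, label) list has to be built by you (for example from a bucket listing), and image decoding and augmentation are left to the decode callback.

```python
def batches_from_paths(samples, batch_size, open_fn=open, decode=None):
    """Yield (items, labels) batches forever, in the style of a Keras generator.

    samples: list of (path, label) pairs.
    open_fn: callable(path, mode) returning a file-like context manager;
             the built-in open by default. On GCS, file_io.FileIO from
             tensorflow.python.lib.io could be passed instead (assumption).
    decode:  optional callable turning raw bytes into an array.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    while True:
        for start in range(0, len(samples), batch_size):
            items, labels = [], []
            for path, label in samples[start:start + batch_size]:
                with open_fn(path, "rb") as f:
                    raw = f.read()
                items.append(decode(raw) if decode else raw)
                labels.append(label)
            yield items, labels
```

A real replacement would also need shuffling, resizing and stacking into arrays, which is why the rewrite is more work than it first appears.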

In addition to dumkar's solution, one can try working with an HDF5 (h5) dataset using TensorFlow's file_io:

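import os
import h5py
from tensorflow.python.lib.io import file_io
# Copy the HDF5 file from GCS to local disk; binary modes keep the bytes intact.
with file_io.FileIO(os.path.join(data_dir, data_file_name), mode='rb') as input_f:
    with file_io.FileIO('dataset.hdf5', mode='wb') as output_f:
        output_f.write(input_f.read())
dataset = h5py.File('dataset.hdf5', 'r')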

This gives you a temporary local copy of a file stored in a GCS bucket. Here is a good gist by aloisg that demonstrates how to create the h5 file from your image dataset: https://gist.github.com/aloisg/ac83160edf8a543b5ee6.

You can now retrieve X_train, y_train, X_eval and y_eval from the dataset and feed them to the Keras model.
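
Reading the four arrays might look like the sketch below. It assumes the h5 file stores them as top-level datasets under exactly those names, which depends on how you created the file; load_splits is a hypothetical helper name.

```python
def load_splits(dataset, names=("X_train", "y_train", "X_eval", "y_eval")):
    """Read the named datasets fully into memory and return them as a tuple.

    `dataset` is an open h5py.File (or any mapping of sliceable objects).
    The default names are an assumption about how the file was written.
    """
    missing = [n for n in names if n not in dataset]
    if missing:
        raise KeyError("missing datasets: %s" % ", ".join(missing))
    # Slicing with [:] loads the whole dataset (a NumPy array for h5py).
    return tuple(dataset[n][:] for n in names)


# Hypothetical usage with the `dataset` handle opened above:
#   X_train, y_train, X_eval, y_eval = load_splits(dataset)
#   model.fit(X_train, y_train, validation_data=(X_eval, y_eval))
```

Loading everything with [:] is only reasonable when the dataset fits in memory; otherwise you would slice it batch by batch.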

It is hard to help with your post as it currently stands. However, the error you get is thrown by os.listdir(), so it is not a Keras problem per se.

This is probably because your directory is not an absolute path, or because it does not exist (maybe a typo or similar). If you update your question with more information, I can help you dig deeper into this.
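
A quick way to check these possibilities before handing the directory to Keras is sketched below. check_data_dir is a hypothetical helper name, and the gs:// branch reflects the point that os.listdir() only works on the local filesystem.

```python
import os


def check_data_dir(path):
    """Return a list of likely reasons why os.listdir(path) would fail."""
    problems = []
    if path.startswith("gs://"):
        # os.listdir() only understands local filesystem paths.
        problems.append("gs:// URLs are not readable with os.listdir()")
        return problems
    if not os.path.isabs(path):
        problems.append("relative path, resolved against " + os.getcwd())
    if not os.path.isdir(path):
        problems.append("not an existing directory: " + os.path.abspath(path))
    return problems


# Hypothetical usage (the path is a placeholder):
#   for problem in check_data_dir("data/train"):
#       print(problem)
```

An empty list means the path is absolute and exists, so the cause would have to be somewhere else.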
