Cannot load model weights saved to GCP with keras.save_weights. Need to transfer to new bucket to load weights

余生颓废 submitted on 2021-02-11 15:36:00

Question


I am training on Google Colab, with data and model weights loaded from and saved to GCP. I am using a Keras callback to save the weights to GCP. This is what the callback looks like:

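import tensorflow as tf

# Save weights only (no optimizer state or architecture), once per epoch.
# Keras fills {loss:.2f} in the file name with that epoch's training loss.
callbacks = [tf.keras.callbacks.ModelCheckpoint(filepath='gs://mybucket/' + 'savename' + '_loss_{loss:.2f}',
                                                monitor='loss',
                                                verbose=1,
                                                save_weights_only=True,
                                                save_freq='epoch')]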
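As a small illustration of how the `filepath` template turns into a name such as `savename_loss_0.13`, the sketch below fills the template with plain `str.format`. It assumes Keras formats the template with the epoch number and the logged metrics; the helper name `checkpoint_path` and the loss values are hypothetical.

```python
# Template copied from the callback above.
template = 'gs://mybucket/' + 'savename' + '_loss_{loss:.2f}'


def checkpoint_path(template, epoch, logs):
    """Fill the template the way Keras is assumed to: with epoch and logged metrics."""
    return template.format(epoch=epoch, **logs)


# Hypothetical loss values for two different epochs.
path_a = checkpoint_path(template, epoch=3, logs={'loss': 0.1349})
path_b = checkpoint_path(template, epoch=4, logs={'loss': 0.1312})

print(path_a)
print(path_a == path_b)  # both losses round to the same two decimals
```

Because the loss is rounded to two decimals, two epochs with close losses map to the same name, so the later save would write to the same prefix as the earlier one.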

Training saves the model weights to my GCP bucket successfully, but when I try to load those weights in a new session, the cell just hangs. I waited for an hour before giving up. At first I thought it was related to TPU/GPU training, but I also tried on a CPU instance, and the weights would not load there either.

Here is the code sample:

model = get_model(**config)
model.load_weights('gs://mybucket/savename_loss_0.13')
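Note that the string passed to `load_weights` here is a checkpoint prefix rather than a single file. Assuming the weights were written in the TensorFlow checkpoint format (the file name has no `.h5` suffix), the bucket should hold an `.index` file and one or more `.data-...` shards under that prefix. The sketch below checks a list of object names for those pieces; the helper `checkpoint_files` and the example names are hypothetical.

```python
def checkpoint_files(prefix, object_names):
    """Report which pieces of a TensorFlow-format checkpoint appear in a listing."""
    index_name = prefix + '.index'
    shards = sorted(n for n in object_names if n.startswith(prefix + '.data-'))
    return {'index_present': index_name in object_names, 'data_shards': shards}


# Hypothetical listing of the bucket contents.
names = [
    'checkpoint',
    'savename_loss_0.13.index',
    'savename_loss_0.13.data-00000-of-00001',
    'savename_loss_0.25.index',
]
report = checkpoint_files('savename_loss_0.13', names)
print(report)
```

If the index file or a shard were missing from the bucket, that would point to an incomplete save rather than a permissions problem.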

To check the model weights, I downloaded them directly into my Colab instance:

!gsutil -m cp -r gs://mybucket/* /content/

I then loaded the model with the weights saved locally on Colab, and it worked. So I guessed that the bucket permissions were somehow different, but they weren't.
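The same copy-then-load workaround can be sketched in Python instead of `gsutil`. The helper below is hypothetical: it copies every file that shares a checkpoint prefix into a local directory and returns the local prefix. It defaults to the standard library so it runs on local paths; passing `tf.io.gfile.glob` and `tf.io.gfile.copy` is assumed to make it read from a `gs://` path, and whether that avoids the hang described here is untested.

```python
import glob
import os
import shutil


def copy_checkpoint(prefix, local_dir, glob_fn=glob.glob, copy_fn=shutil.copy):
    """Copy every file whose path starts with `prefix` into `local_dir`.

    Returns the local checkpoint prefix to hand to model.load_weights.
    """
    os.makedirs(local_dir, exist_ok=True)
    for src in glob_fn(prefix + '*'):
        dst = os.path.join(local_dir, os.path.basename(src))
        copy_fn(src, dst)
    return os.path.join(local_dir, os.path.basename(prefix))


# With TensorFlow installed, the bucket could be read like this (assumption):
# import tensorflow as tf
# local_prefix = copy_checkpoint('gs://mybucket/savename_loss_0.13', '/content/ckpt',
#                                glob_fn=tf.io.gfile.glob, copy_fn=tf.io.gfile.copy)
# model.load_weights(local_prefix)
```

Keeping the glob and copy functions as parameters lets the helper be exercised on local files without any cloud access.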

For all the files, the Storage class is standard, and access control is standard.

So I created a new bucket and uploaded the weights from my Colab instance into it, and from this new bucket I was able to load the weights within a few seconds. I confirmed the weights were correct. To be clear, these are the same weights I downloaded from the original bucket into Google Colab's local environment.

I then wondered whether the permissions of the old and new buckets somehow differed, but in the console the files all have the same attributes (region, encryption, etc.), and both buckets have exactly the same attributes as well.

The only thing I can think of is that there is some attribute I am overlooking, one that is not displayed in either the bucket console or the file console.

Source: https://stackoverflow.com/questions/62866698/can-not-load-model-weights-saved-to-gcp-with-keras-save-weights-need-to-transfe
