Keras: Load checkpoint weights HDF5 generated by multiple GPUs

Posted by 梦想与她 on 2019-12-23 18:31:42

Question


Checkpoint snippet:

checkpointer = ModelCheckpoint(filepath=os.path.join(savedir, "mid/weights.{epoch:02d}.hd5"), monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False)
hist = model.fit_generator(
    gen.generate(batch_size = batch_size, nb_classes=nb_classes), samples_per_epoch=593920, nb_epoch=nb_epoch, verbose=1, callbacks=[checkpointer], validation_data = gen.vld_generate(VLD_PATH, batch_size = 64, nb_classes=nb_classes), nb_val_samples=10000
)

I trained my model on a multi-GPU host, which dumps intermediate ("mid") checkpoint files in HDF5 format. When I loaded one of them on a single-GPU machine with model.load_weights(), the following error was raised:

Using TensorFlow backend.
Traceback (most recent call last):
  File "server.py", line 171, in <module>
    model = load_model_and_weights('zhch.yml', '7_weights.52.hd5')
  File "server.py", line 16, in load_model_and_weights
    model.load_weights(os.path.join('model', weights_name))
  File "/home/lz/code/ProjectGo/meta/project/libpolicy-server/.virtualenv/lib/python3.5/site-packages/keras/engine/topology.py", line 2701, in load_weights
    self.load_weights_from_hdf5_group(f)
  File "/home/lz/code/ProjectGo/meta/project/libpolicy-server/.virtualenv/lib/python3.5/site-packages/keras/engine/topology.py", line 2753, in load_weights_from_hdf5_group
    str(len(flattened_layers)) + ' layers.')
ValueError: You are trying to load a weight file containing 1 layers into a model with 21 layers.

Is there any way to load checkpoint weights generated on multiple GPUs on a single-GPU machine? No Keras issue seems to discuss this problem, so any help would be appreciated.


Answer 1:


You can load your model on a single GPU like this:

from keras.models import load_model

# 'mid' stands for the path of a checkpoint saved on the multi-GPU host
multi_gpus_model = load_model('mid')
# use multi_gpus_model.summary() to find which layer holds the original model
origin_model = multi_gpus_model.layers[-2]
origin_model.save_weights('single_gpu_model.hdf5')

'single_gpu_model.hdf5' is the weights file that you can then load into the model on the single-GPU machine.
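The index -2 depends on how the multi-GPU wrapper was assembled, so it may differ between setups. Below is a minimal sketch of locating the original model by structure instead of by position. It assumes the wrapper holds the original model as a nested layer that has its own `layers` attribute. `find_inner_model` is a hypothetical helper name, not part of Keras.

```python
def find_inner_model(wrapper_model):
    """Return the first sub-layer that is itself a model (has its own `layers`).

    Assumption: the multi-GPU wrapper nests the original model as one of
    its layers. Plain layers have no `layers` attribute and are skipped.
    """
    for layer in wrapper_model.layers:
        if hasattr(layer, "layers"):
            return layer
    raise ValueError("no nested model found; is this a multi-GPU wrapper?")
```

With this, `origin_model = find_inner_model(multi_gpus_model)` could stand in for the hard-coded index. Checking the result against `multi_gpus_model.summary()` is still worthwhile.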




Answer 2:


Try this function:

def keras_model_reassign_weights(model_cpu, model_gpu):
    """Copy weights from the model nested in model_gpu into model_cpu, matching layers by name."""
    weights_temp = {}
    print('_' * 5, 'Collecting weights from GPU model', '_' * 5)
    for layer in model_gpu.layers:
        # Only the nested original model has a `layers` attribute of its own.
        if not hasattr(layer, 'layers'):
            print('Skipped: ', layer.name)
            continue
        for layer_unw in layer.layers:
            weights_temp[layer_unw.name] = layer_unw.get_weights()
        break  # stop after the first nested model
    print('_' * 5, 'Writing weights to CPU model', '_' * 5)
    for layer in model_cpu.layers:
        try:
            layer.set_weights(weights_temp[layer.name])
        except (KeyError, ValueError):
            # No layer with this name was collected, or the weight shapes differ.
            print(layer.name, 'weights were not set for this layer!')
    return model_cpu

But you need to load the weights into your multi-GPU model first:

# Load or build your Keras multi-GPU model (placeholder: replace None).
model_gpu = None
# Load or build a model with the same architecture, without wrapping it in
# keras.utils.multi_gpu_model (placeholder: replace None).
model_cpu = None
# Load the checkpoint weights into the multi-GPU model.
model_gpu.load_weights(r'gpu_model_best_checkpoint.hdf5')
# Copy the weights across.
model_cpu = keras_model_reassign_weights(model_cpu, model_gpu)
# Save the resulting weights for the single-device model.
model_cpu.save_weights(r'CPU_model.hdf5')

After the transfer, you can use the weights with a single-GPU or CPU model.
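The function above matches layers by name and only prints a message when a layer receives no weights, so it may help to check the name overlap before transferring. Below is a minimal sketch, assuming both models expose Keras-style `layers` whose items carry a `name` attribute. `unmatched_layer_names` is a hypothetical helper, not part of Keras, and `inner_model` stands for the original model nested inside the multi-GPU wrapper.

```python
def unmatched_layer_names(model_cpu, inner_model):
    """List names of layers in model_cpu with no same-named layer in inner_model.

    Assumption: weights are transferred by layer name, so any name returned
    here would keep its existing (e.g. freshly initialized) weights.
    """
    source_names = {layer.name for layer in inner_model.layers}
    return [layer.name for layer in model_cpu.layers
            if layer.name not in source_names]
```

An empty list means every layer in model_cpu has a same-named source layer. It does not guarantee that the weight shapes agree, so a mismatch can still surface when the weights are set.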



Source: https://stackoverflow.com/questions/41342098/keras-load-checkpoint-weights-hdf5-generated-by-multiple-gpus
