Given a predefined Keras model, I am trying to first load pre-trained weights, then remove one to three of the model's internal layers (not the last few), and then replace them with another layer.
I can't seem to find any documentation on keras.io about how to do such a thing, or how to remove layers from a predefined model at all.
The model I am using is a good ole VGG-16 network which is instantiated in a function as shown below:
    def model(self, output_shape):
        # Prepare image for input to model
        img_input = Input(shape=self._input_shape)

        # Block 1
        x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
        x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

        # Block 2
        x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
        x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

        # Block 3
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

        # Block 4
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

        # Block 5
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dropout(0.5)(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dropout(0.5)(x)
        x = Dense(output_shape, activation='softmax', name='predictions')(x)

        inputs = img_input

        # Create model.
        model = Model(inputs, x, name=self._name)
        return model
So as an example, I'd like to take the two Conv layers in Block 1 and replace them with just one Conv layer, after loading the original weights into all of the other layers.
Any ideas?
Assume that you have a model vgg_model, initialized either by your function above or by keras.applications.VGG16(weights='imagenet'). You need to insert a new layer in the middle in such a way that the weights of the other layers are preserved.
The idea is to disassemble the whole network into separate layers, then assemble it back, calling each kept layer on the output of the one before it. Here is the code specifically for your task:
    from keras import applications
    from keras.layers import Conv2D
    from keras.models import Model

    vgg_model = applications.VGG16(include_top=True, weights='imagenet')

    # Disassemble layers
    layers = [l for l in vgg_model.layers]

    # Defining new convolutional layer, attached to the input layer's output.
    # Important: the number of filters should be the same!
    # Note: the receptive field of two 3x3 convolutions is 5x5.
    new_conv = Conv2D(filters=64,
                      kernel_size=(5, 5),
                      name='new_conv',
                      padding='same')(layers[0].output)

    # Now stack everything back, skipping layers[1] and layers[2]
    # (block1_conv1 and block1_conv2).
    # Note: If you are going to fine tune the model, do not forget to
    # mark other layers as un-trainable
    x = new_conv
    for i in range(3, len(layers)):
        layers[i].trainable = False
        x = layers[i](x)

    # Final touch
    result_model = Model(inputs=layers[0].input, outputs=x)
    result_model.summary()
And the output of the above code is:
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    input_50 (InputLayer)        (None, 224, 224, 3)       0
    _________________________________________________________________
    new_conv (Conv2D)            (None, 224, 224, 64)      1792
    _________________________________________________________________
    block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
    _________________________________________________________________
    block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
    _________________________________________________________________
    block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
    _________________________________________________________________
    block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
    _________________________________________________________________
    block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
    _________________________________________________________________
    block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
    _________________________________________________________________
    block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
    _________________________________________________________________
    block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
    _________________________________________________________________
    block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
    _________________________________________________________________
    block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
    _________________________________________________________________
    block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
    _________________________________________________________________
    block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
    _________________________________________________________________
    block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808
    _________________________________________________________________
    block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0
    _________________________________________________________________
    flatten (Flatten)            (None, 25088)             0
    _________________________________________________________________
    fc1 (Dense)                  (None, 4096)              102764544
    _________________________________________________________________
    fc2 (Dense)                  (None, 4096)              16781312
    _________________________________________________________________
    predictions (Dense)          (None, 1000)              4097000
    =================================================================
    Total params: 138,320,616
    Trainable params: 1,792
    Non-trainable params: 138,318,824
    _________________________________________________________________
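As a sanity check on the numbers above, here is a small sketch in plain Python (no Keras needed) that recomputes the receptive field mentioned in the code comment and the parameter counts in the summary. The helper names conv2d_params and stacked_receptive_field are made up for this illustration, and the formulas assume standard convolutions with a bias term and no dilation.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Parameter count of a standard (non-dilated, non-grouped) 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def stacked_receptive_field(kernel_sizes, strides=None):
    """Receptive field along one axis of convolutions applied one after another."""
    strides = strides or [1] * len(kernel_sizes)
    field, jump = 1, 1
    for kernel, stride in zip(kernel_sizes, strides):
        field += (kernel - 1) * jump  # each layer widens the field by (k - 1) steps
        jump *= stride                # a stride spreads later steps further apart
    return field


# Frozen (non-trainable) parameter count, copied from the summary above.
FROZEN_PARAMS = 138_318_824

# Two stacked 3x3 convolutions with stride 1 see a 5x5 input window.
rf_two_3x3 = stacked_receptive_field([3, 3])

# The new layer reads the 3-channel input image and produces 64 feature maps.
params_3x3 = conv2d_params(3, 3, in_channels=3, filters=64)
params_5x5 = conv2d_params(5, 5, in_channels=3, filters=64)

print("receptive field of two 3x3 convs:", rf_two_3x3)
print("new_conv params with a 3x3 kernel:", params_3x3)
print("new_conv params with a 5x5 kernel:", params_5x5)
print("total with a 3x3 kernel:", FROZEN_PARAMS + params_3x3)
print("total with a 5x5 kernel:", FROZEN_PARAMS + params_5x5)
```

The 1,792 parameters listed for new_conv match a 3x3 kernel, while the kernel_size=(5, 5) layer in the code would have 4,864. So the summary above was most likely produced with a 3x3 kernel; with the 5x5 layer you should expect 4,864 trainable parameters and a correspondingly larger total.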