Add a resizing layer to a Keras Sequential model


How can I add a resizing layer to

model = Sequential()

using

model.add(...)

to resize an image from shape (160, 320, 3) to (224, 224, 3)?

5 Answers
  • 2020-12-17 18:38

    I think you should consider using TensorFlow's resize_images op.

    https://www.tensorflow.org/api_docs/python/tf/image/resize_images

    It appears Keras does not include this, perhaps because the feature does not exist in Theano. I have written a custom Keras layer that does the same. It's a quick hack, so it might not work well in your case.

    import keras
    import keras.backend as K
    from keras.utils import conv_utils
    from keras.engine import InputSpec
    from keras.engine import Layer
    from tensorflow import image as tfi
    
    class ResizeImages(Layer):
        """Resize Images to a specified size
    
        # Arguments
            output_size: Size of output layer width and height
            data_format: A string,
                one of `channels_last` (default) or `channels_first`.
                The ordering of the dimensions in the inputs.
                `channels_last` corresponds to inputs with shape
                `(batch, height, width, channels)` while `channels_first`
                corresponds to inputs with shape
                `(batch, channels, height, width)`.
                It defaults to the `image_data_format` value found in your
                Keras config file at `~/.keras/keras.json`.
                If you never set it, then it will be "channels_last".
    
        # Input shape
            - If `data_format='channels_last'`:
                4D tensor with shape:
                `(batch_size, rows, cols, channels)`
            - If `data_format='channels_first'`:
                4D tensor with shape:
                `(batch_size, channels, rows, cols)`
    
        # Output shape
            - If `data_format='channels_last'`:
                4D tensor with shape:
                `(batch_size, pooled_rows, pooled_cols, channels)`
            - If `data_format='channels_first'`:
                4D tensor with shape:
                `(batch_size, channels, pooled_rows, pooled_cols)`
        """
        def __init__(self, output_dim=(1, 1), data_format=None, **kwargs):
            super(ResizeImages, self).__init__(**kwargs)
            self.output_dim = conv_utils.normalize_tuple(output_dim, 2, 'output_dim')
            self.data_format = conv_utils.normalize_data_format(data_format)
            self.input_spec = InputSpec(ndim=4)
    
        def build(self, input_shape):
            self.input_spec = [InputSpec(shape=input_shape)]
    
        def compute_output_shape(self, input_shape):
            if self.data_format == 'channels_first':
                return (input_shape[0], input_shape[1], self.output_dim[0], self.output_dim[1])
            elif self.data_format == 'channels_last':
                return (input_shape[0], self.output_dim[0], self.output_dim[1], input_shape[3])
    
        def _resize_fun(self, inputs, data_format):
            try:
                assert keras.backend.backend() == 'tensorflow'
                assert self.data_format == 'channels_last'
            except AssertionError:
                print "Only tensorflow backend is supported for the resize layer and accordingly 'channels_last' ordering"
            output = tfi.resize_images(inputs, self.output_dim)
            return output
    
        def call(self, inputs):
            output = self._resize_fun(inputs=inputs, data_format=self.data_format)
            return output
    
        def get_config(self):
            config = {'output_dim': self.output_dim,
                      'data_format': self.data_format}
            base_config = super(ResizeImages, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))
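
    For completeness, a minimal usage sketch (assuming the TensorFlow backend, 'channels_last' ordering, and the 160x320 input from the question):

    from keras.models import Sequential

    model = Sequential()
    model.add(ResizeImages(output_dim=(224, 224), input_shape=(160, 320, 3)))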
    
  • 2020-12-17 18:49

    To resize the given input image to a target size (in this case 224x224x3), use a Lambda layer in plain Keras:

    from keras.models import Model
    from keras.layers import Input, Lambda
    from keras.backend import tf as ktf

    inp = Input(shape=(None, None, 3))
    out = Lambda(lambda image: ktf.image.resize_images(image, (224, 224)))(inp)
    model = Model(inputs=inp, outputs=out)

    [Ref: https://www.tensorflow.org/api_docs/python/tf/keras/backend/resize_images]
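
    If you need this inside a Sequential model, the same Lambda fits there too (a sketch, assuming the 160x320 input from the question):

    from keras.models import Sequential

    model = Sequential()
    model.add(Lambda(lambda image: ktf.image.resize_images(image, (224, 224)),
                     input_shape=(160, 320, 3)))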

  • 2020-12-17 18:52

    A modification of @KeithWM's answer, adding output_scale, e.g. output_scale=2 means the output is twice the input's spatial size :)

    import keras
    from keras.utils import conv_utils
    from keras.engine import InputSpec
    from keras.engine import Layer
    from keras.backend.common import normalize_data_format
    import tensorflow as tf

    class ResizeImages(Layer):
        """Resize Images to a specified size
        https://stackoverflow.com/questions/41903928/add-a-resizing-layer-to-a-keras-sequential-model
    
        # Arguments
            output_dim: Size of output layer width and height
            output_scale: scale compared with input
            data_format: A string,
                one of `channels_last` (default) or `channels_first`.
                The ordering of the dimensions in the inputs.
                `channels_last` corresponds to inputs with shape
                `(batch, height, width, channels)` while `channels_first`
                corresponds to inputs with shape
                `(batch, channels, height, width)`.
                It defaults to the `image_data_format` value found in your
                Keras config file at `~/.keras/keras.json`.
                If you never set it, then it will be "channels_last".
    
        # Input shape
            - If `data_format='channels_last'`:
                4D tensor with shape:
                `(batch_size, rows, cols, channels)`
            - If `data_format='channels_first'`:
                4D tensor with shape:
                `(batch_size, channels, rows, cols)`
    
        # Output shape
            - If `data_format='channels_last'`:
                4D tensor with shape:
                `(batch_size, pooled_rows, pooled_cols, channels)`
            - If `data_format='channels_first'`:
                4D tensor with shape:
                `(batch_size, channels, pooled_rows, pooled_cols)`
        """
    
        def __init__(self, output_dim=(1, 1), output_scale=None, data_format=None, **kwargs):
            super(ResizeImages, self).__init__(**kwargs)
            self.naive_output_dim = conv_utils.normalize_tuple(output_dim,
                                                               2, 'output_dim')
            self.naive_output_scale = output_scale
            # normalize_data_format moved from conv_utils to keras.backend.common
            # in newer Keras versions
            self.data_format = normalize_data_format(data_format)
            self.input_spec = InputSpec(ndim=4)
    
        def build(self, input_shape):
            self.input_spec = [InputSpec(shape=input_shape)]
            if self.naive_output_scale is not None:
                if self.data_format == 'channels_first':
                    self.output_dim = (self.naive_output_scale * input_shape[2],
                                       self.naive_output_scale * input_shape[3])
                elif self.data_format == 'channels_last':
                    self.output_dim = (self.naive_output_scale * input_shape[1],
                                       self.naive_output_scale * input_shape[2])
            else:
                self.output_dim = self.naive_output_dim
    
        def compute_output_shape(self, input_shape):
            if self.data_format == 'channels_first':
                return (input_shape[0], input_shape[1], self.output_dim[0], self.output_dim[1])
            elif self.data_format == 'channels_last':
                return (input_shape[0], self.output_dim[0], self.output_dim[1], input_shape[3])
    
        def _resize_fun(self, inputs, data_format):
            try:
                assert keras.backend.backend() == 'tensorflow'
                assert self.data_format == 'channels_last'
            except AssertionError:
                print("Only tensorflow backend is supported for the resize layer and accordingly 'channels_last' ordering")
            output = tf.image.resize_images(inputs, self.output_dim)
            return output
    
        def call(self, inputs):
            output = self._resize_fun(inputs=inputs, data_format=self.data_format)
            return output
    
        def get_config(self):
            config = {'output_dim': self.output_dim,
                      'data_format': self.data_format}
                      'data_format': self.data_format}
            base_config = super(ResizeImages, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))
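
    For example, to double the spatial size of feature maps inside a model (a sketch, assuming the TensorFlow backend and 'channels_last' ordering; the Conv2D layer is only illustrative):

    from keras.models import Sequential
    from keras.layers import Conv2D

    model = Sequential()
    model.add(Conv2D(16, (3, 3), padding='same', input_shape=(80, 160, 3)))
    model.add(ResizeImages(output_scale=2))  # spatial dims double: 80x160 -> 160x320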
    
  • 2020-12-17 18:54

    The accepted answer uses the Reshape layer, which works like NumPy's reshape and can, for example, turn a 4x4 matrix into a 2x8 matrix. But that makes the image lose locality information:

    0 0 0 0
    1 1 1 1    ->    0 0 0 0 1 1 1 1
    2 2 2 2          2 2 2 2 3 3 3 3
    3 3 3 3
    

    Instead, image data should be rescaled / "resized" using, e.g., TensorFlow's tf.image.resize_images. But beware of its correct usage and its known bugs! As shown in the related question, it can be used with a Lambda layer:

    model.add(keras.layers.Lambda(
        lambda image: tf.image.resize_images(
            image,
            (224, 224),
            method=tf.image.ResizeMethod.BICUBIC,
            align_corners=True,  # possibly important
            preserve_aspect_ratio=True
        )
    ))
    

    In your case, with a 160x320 image, you also have to decide whether to keep the aspect ratio or not. If you want to use a pre-trained network, you should apply the same kind of resizing that the network was trained with.
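
    For instance, if the pre-trained network expects plain 224x224 inputs, a minimal sketch that stretches the 160x320 image (i.e. does not preserve the aspect ratio) would be:

    model.add(keras.layers.Lambda(
        lambda image: tf.image.resize_images(image, (224, 224))
    ))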

  • 2020-12-17 18:55

    Normally you would use the Reshape layer for this:

    model.add(Reshape((224, 224, 3), input_shape=(160, 320, 3)))
    

    but since your target dimensions can't hold all the data from the input dimensions (224*224 = 50176, while 160*320 = 51200), this won't work. You can only use Reshape if the number of elements stays the same.

    If you are fine with losing some data in your image, you can specify your own lossy reshape:

    model.add(Reshape((-1, 3), input_shape=(160, 320, 3)))  # flatten to (51200, 3)
    model.add(Lambda(lambda x: x[:, :50176]))  # throw away some, so that #data = 224^2
    model.add(Reshape((224, 224, 3)))
    

    That said, these transforms are usually done before feeding the data to the model, because resizing in every training step is essentially wasted computation.
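
    For example, the images could be resized once, up front (a sketch, assuming images is a NumPy array of shape (N, 160, 320, 3); scipy.ndimage.zoom is one of several options):

    from scipy.ndimage import zoom

    # zoom factors per axis: keep batch and channels, rescale 160 -> 224 and 320 -> 224
    resized = zoom(images, (1, 224 / 160, 224 / 320, 1), order=1)  # -> (N, 224, 224, 3)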
