Add a resizing layer to a keras sequential model

回眸只為那壹抹淺笑 submitted on 2019-12-04 02:02:44

Normally you would use the Reshape layer for this:

model.add(Reshape((224,224,3), input_shape=(160,320,3)))

but since your target dimensions cannot hold all the data from the input dimensions (224*224 != 160*320), this won't work. You can only use Reshape if the number of elements does not change.
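The element-count rule can be checked with NumPy, whose reshape follows the same constraint. This is only a minimal sketch: the zero-filled array is a stand-in for one image, not real data.

```python
import numpy as np

src_shape = (160, 320, 3)   # input image shape from the question
dst_shape = (224, 224, 3)   # shape the model expects

src_count = int(np.prod(src_shape))  # 160 * 320 * 3
dst_count = int(np.prod(dst_shape))  # 224 * 224 * 3

image = np.zeros(src_shape, dtype=np.float32)  # stand-in for one image
try:
    image.reshape(dst_shape)
    reshape_ok = True
except ValueError:
    # NumPy refuses because the element counts differ
    reshape_ok = False

print(src_count, dst_count, reshape_ok)
```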

If you are fine with losing some data in your image, you can specify your own lossy reshape:

model.add(Reshape((-1,3), input_shape=(160,320,3)))
model.add(Lambda(lambda x: x[:, :50176])) # throw away some pixels per image, so that #data = 224^2
model.add(Reshape((224,224,3)))
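What these three layers do to a batch can be imitated in NumPy. The sketch below assumes a made-up batch of two images. The point to notice is that the slice has to run along the pixel axis (axis 1), since axis 0 is the batch.

```python
import numpy as np

# Made-up batch of two 160x320 RGB images with distinct values per element
batch = np.arange(2 * 160 * 320 * 3, dtype=np.float32).reshape(2, 160, 320, 3)

flat = batch.reshape(2, -1, 3)      # one row per pixel: (2, 51200, 3)
kept = flat[:, :224 * 224]          # keep the first 50176 pixels of each image
out = kept.reshape(2, 224, 224, 3)  # lossy "reshape" to the target size

print(flat.shape, kept.shape, out.shape)
```

The result has the right shape, but it is a truncated and re-wrapped copy of the pixels, not a scaled image.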

That said, these transforms are often done before the data is fed to the model, because doing them in every training step is essentially wasted computation time.
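As a rough illustration of resizing once, up front, here is a nearest-neighbour resize in plain NumPy. The helper name resize_nearest and the random placeholder dataset are assumptions for this sketch; a real pipeline would more likely use an image library with proper interpolation.

```python
import numpy as np

def resize_nearest(images, out_h, out_w):
    """Nearest-neighbour resize of a (batch, h, w, c) array (hypothetical helper)."""
    in_h, in_w = images.shape[1:3]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return images[:, rows[:, None], cols[None, :], :]

raw = np.random.rand(4, 160, 320, 3).astype(np.float32)  # placeholder dataset
resized = resize_nearest(raw, 224, 224)  # done once, before training starts
print(resized.shape)
```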

I think you should consider using TensorFlow's resize_images function.

https://www.tensorflow.org/api_docs/python/tf/image/resize_images

It appears Keras does not include this, perhaps because the feature does not exist in Theano. I have written a custom Keras layer that does the same. It's a quick hack, so it might not work well in your case.

import keras
import keras.backend as K
from keras.utils import conv_utils
from keras.engine import InputSpec
from keras.engine import Layer
from tensorflow import image as tfi

class ResizeImages(Layer):
    """Resize Images to a specified size

    # Arguments
        output_dim: Size of output layer width and height
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".

    # Input shape
        - If `data_format='channels_last'`:
            4D tensor with shape:
            `(batch_size, rows, cols, channels)`
        - If `data_format='channels_first'`:
            4D tensor with shape:
            `(batch_size, channels, rows, cols)`

    # Output shape
        - If `data_format='channels_last'`:
            4D tensor with shape:
            `(batch_size, resized_rows, resized_cols, channels)`
        - If `data_format='channels_first'`:
            4D tensor with shape:
            `(batch_size, channels, resized_rows, resized_cols)`
    """
    def __init__(self, output_dim=(1, 1), data_format=None, **kwargs):
        super(ResizeImages, self).__init__(**kwargs)
        self.output_dim = conv_utils.normalize_tuple(output_dim, 2, 'output_dim')
        self.data_format = conv_utils.normalize_data_format(data_format)
        self.input_spec = InputSpec(ndim=4)

    def build(self, input_shape):
        self.input_spec = [InputSpec(shape=input_shape)]

    def compute_output_shape(self, input_shape):
        if self.data_format == 'channels_first':
            return (input_shape[0], input_shape[1], self.output_dim[0], self.output_dim[1])
        elif self.data_format == 'channels_last':
            return (input_shape[0], self.output_dim[0], self.output_dim[1], input_shape[3])

    def _resize_fun(self, inputs, data_format):
        try:
            assert keras.backend.backend() == 'tensorflow'
            assert self.data_format == 'channels_last'
        except AssertionError:
            print("Only tensorflow backend is supported for the resize layer and accordingly 'channels_last' ordering")
        output = tfi.resize_images(inputs, self.output_dim)
        return output

    def call(self, inputs):
        output = self._resize_fun(inputs=inputs, data_format=self.data_format)
        return output

    def get_config(self):
        # The layer has no padding attribute, so only real attributes are saved
        config = {'output_dim': self.output_dim,
                  'data_format': self.data_format}
        base_config = super(ResizeImages, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
mxmlnkn

The accepted answer uses the Reshape layer, which works like NumPy's reshape. That can turn a 4x4 matrix into a 2x8 matrix, but the image loses its locality information:

0 0 0 0
1 1 1 1    ->    0 0 0 0 1 1 1 1
2 2 2 2          2 2 2 2 3 3 3 3
3 3 3 3
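The diagram above can be reproduced directly in NumPy. This is a small sketch in which row i of the 4x4 "image" is filled with the value i:

```python
import numpy as np

image = np.repeat(np.arange(4), 4).reshape(4, 4)  # row i holds the value i
reshaped = image.reshape(2, 8)                    # rows 0 and 1 merge, as do rows 2 and 3

print(image)
print(reshaped)
```

Pixels that were vertical neighbours in the original end up side by side in the same row, which is why reshape is not a substitute for resizing.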

Instead, image data should be rescaled / "resized" using, e.g., TensorFlow's tf.image.resize_images. But be careful to use it correctly, and watch out for its bugs! As shown in the related question, this can be used with a Lambda layer:

model.add( keras.layers.Lambda( 
    lambda image: tf.image.resize_images( 
        image, 
        (224, 224), 
        method = tf.image.ResizeMethod.BICUBIC,
        align_corners = True, # possibly important
        preserve_aspect_ratio = True
    )
))

In your case, as you have a 160x320 image, you also have to decide whether or not to keep the aspect ratio. If you want to use a pre-trained network, then you should use the same kind of resizing that the network was trained with.
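To see what keeping the aspect ratio means for a 160x320 input and a 224x224 target, the arithmetic can be sketched in plain Python. The helper fit_within is hypothetical, and the rounding used here is an assumption that may differ from what TensorFlow does internally.

```python
def fit_within(in_h, in_w, max_h, max_w):
    """Largest size with the input's aspect ratio that fits in (max_h, max_w)."""
    scale = min(max_h / in_h, max_w / in_w)  # the tighter dimension sets the scale
    return round(in_h * scale), round(in_w * scale)

new_h, new_w = fit_within(160, 320, 224, 224)
pad_h, pad_w = 224 - new_h, 224 - new_w  # rows/columns missing from a full 224x224

print((new_h, new_w), (pad_h, pad_w))
```

So an aspect-preserving resize does not fill the 224x224 target by itself. The rest has to come from padding or cropping, whichever matches how the network was trained.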

A modification of @KeithWM's answer that adds output_scale; for example, output_scale=2 makes the output height and width twice those of the input :)

class ResizeImages(Layer):
    """Resize Images to a specified size
    https://stackoverflow.com/questions/41903928/add-a-resizing-layer-to-a-keras-sequential-model

    # Arguments
        output_dim: Size of output layer width and height
        output_scale: scale compared with input
        data_format: A string,
            one of `channels_last` (default) or `channels_first`.
            The ordering of the dimensions in the inputs.
            `channels_last` corresponds to inputs with shape
            `(batch, height, width, channels)` while `channels_first`
            corresponds to inputs with shape
            `(batch, channels, height, width)`.
            It defaults to the `image_data_format` value found in your
            Keras config file at `~/.keras/keras.json`.
            If you never set it, then it will be "channels_last".

    # Input shape
        - If `data_format='channels_last'`:
            4D tensor with shape:
            `(batch_size, rows, cols, channels)`
        - If `data_format='channels_first'`:
            4D tensor with shape:
            `(batch_size, channels, rows, cols)`

    # Output shape
        - If `data_format='channels_last'`:
            4D tensor with shape:
            `(batch_size, resized_rows, resized_cols, channels)`
        - If `data_format='channels_first'`:
            4D tensor with shape:
            `(batch_size, channels, resized_rows, resized_cols)`
    """

    def __init__(self, output_dim=(1, 1), output_scale=None, data_format=None, **kwargs):
        super(ResizeImages, self).__init__(**kwargs)
        # normalize_data_format is called directly; presumably conv_utils does not have it in this Keras version
        self.naive_output_dim = conv_utils.normalize_tuple(output_dim,
                                                           2, 'output_dim')
        self.naive_output_scale = output_scale
        self.data_format = normalize_data_format(data_format)
        self.input_spec = InputSpec(ndim=4)

    def build(self, input_shape):
        self.input_spec = [InputSpec(shape=input_shape)]
        if self.naive_output_scale is not None:
            if self.data_format == 'channels_first':
                self.output_dim = (self.naive_output_scale * input_shape[2],
                                   self.naive_output_scale * input_shape[3])
            elif self.data_format == 'channels_last':
                self.output_dim = (self.naive_output_scale * input_shape[1],
                                   self.naive_output_scale * input_shape[2])
        else:
            self.output_dim = self.naive_output_dim

    def compute_output_shape(self, input_shape):
        if self.data_format == 'channels_first':
            return (input_shape[0], input_shape[1], self.output_dim[0], self.output_dim[1])
        elif self.data_format == 'channels_last':
            return (input_shape[0], self.output_dim[0], self.output_dim[1], input_shape[3])

    def _resize_fun(self, inputs, data_format):
        try:
            assert keras.backend.backend() == 'tensorflow'
            assert self.data_format == 'channels_last'
        except AssertionError:
            print("Only tensorflow backend is supported for the resize layer and accordingly 'channels_last' ordering")
        output = tf.image.resize_images(inputs, self.output_dim)
        return output

    def call(self, inputs):
        output = self._resize_fun(inputs=inputs, data_format=self.data_format)
        return output

    def get_config(self):
        # Save the constructor arguments; the layer has no padding attribute
        config = {'output_dim': self.naive_output_dim,
                  'output_scale': self.naive_output_scale,
                  'data_format': self.data_format}
        base_config = super(ResizeImages, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))