TensorFlow: How to Pool over Depth?

Question

I have the following parameters defined for a max pool over the depth of the image (RGB), to compress it before the dense layer and readout. It fails with an error saying that I cannot pool over depth and everything else:

sunset_poolmax_1x1x3_div_2x2x3_params = \
    {'pool_function':tf.nn.max_pool,
     'ksize':[1,1,1,3],
     'strides':[1,1,1,3],
     'padding': 'SAME'}

I changed the strides to [1,1,1,3] so that depth is the only dimension reduced by the pool, but it still doesn't work. I can't get good results with the tiny image I would otherwise have to compress everything down to in order to keep the colors.

Actual Error:

ValueError: Current implementation does not support pooling in the batch and depth dimensions.

Answer 1:

tf.nn.max_pool does not support pooling over the depth dimension, which is why you get an error.

You can use a max reduction instead to achieve what you're looking for:

tf.reduce_max(input_tensor, axis=[3], keepdims=True)

The keepdims parameter above ensures that the rank of the tensor is preserved (older TensorFlow releases spelled these arguments reduction_indices and keep_dims). The result therefore has the same layout that tf.nn.max_pool would produce if it supported pooling over the whole depth dimension.
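To illustrate what preserving the rank means, here is a minimal sketch that uses NumPy as a stand-in for the TensorFlow call. The array shape and the variable names are assumptions chosen for the example; they do not come from the original question.

```python
import numpy as np

# Hypothetical NHWC batch: 2 images, 4x4 pixels, 3 channels.
rng = np.random.default_rng(0)
images = rng.random((2, 4, 4, 3))

# Max over the channel axis. keepdims=True keeps a size-1 channel axis,
# so the result is still rank 4, like the output of a pooling op.
pooled = np.max(images, axis=3, keepdims=True)

# Without keepdims the channel axis disappears and the rank drops to 3.
dropped = np.max(images, axis=3)

print(pooled.shape)   # (2, 4, 4, 1)
print(dropped.shape)  # (2, 4, 4)
```

Keeping the size-1 channel axis matters when later layers expect a 4-D NHWC tensor.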

Answer 2:

TensorFlow now supports depth-wise max pooling with tf.nn.max_pool(). For example, here is how to implement it using pooling kernel size 3, stride 3 and VALID padding:

import tensorflow as tf

output = tf.nn.max_pool(images,
                        ksize=(1, 1, 1, 3),
                        strides=(1, 1, 1, 3),
                        padding="VALID")
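To make it clearer what this call computes, the following NumPy sketch reproduces depth-wise max pooling for the special case where the stride equals the kernel size and the padding is VALID. The helper name depth_max_pool_valid is hypothetical, and this is only an illustration of the arithmetic, not TensorFlow's implementation.

```python
import numpy as np

def depth_max_pool_valid(x, pool_size):
    """Max over non-overlapping groups of `pool_size` channels.

    Mirrors stride == pool_size with VALID padding on an NHWC array:
    channels that do not fill a complete group are dropped.
    """
    n, h, w, c = x.shape
    groups = c // pool_size
    x = x[..., :groups * pool_size]          # drop leftover channels
    return x.reshape(n, h, w, groups, pool_size).max(axis=-1)

# Hypothetical input with 6 channels: pooling by 3 leaves 2 channels.
x = np.arange(2 * 2 * 2 * 6, dtype=np.float32).reshape(2, 2, 2, 6)
y = depth_max_pool_valid(x, 3)
print(y.shape)     # (2, 2, 2, 2)
print(y[0, 0, 0])  # [2. 5.]
```

For an RGB image (3 channels) and a pool size of 3, this collapses each pixel to a single value, namely the largest of its three channel values.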

You can use this in a Keras model by wrapping it in a Lambda layer:

from tensorflow import keras

depth_pool = keras.layers.Lambda(
    lambda X: tf.nn.max_pool(X,
                             ksize=(1, 1, 1, 3),
                             strides=(1, 1, 1, 3),
                             padding="VALID"))

model = keras.models.Sequential([
    ..., # other layers
    depth_pool,
    ... # other layers
])

Alternatively, you can write a custom Keras layer:

class DepthMaxPool(keras.layers.Layer):
    def __init__(self, pool_size, strides=None, padding="VALID", **kwargs):
        super().__init__(**kwargs)
        if strides is None:
            strides = pool_size
        self.pool_size = pool_size
        self.strides = strides
        self.padding = padding
    def call(self, inputs):
        return tf.nn.max_pool(inputs,
                              ksize=(1, 1, 1, self.pool_size),
                              strides=(1, 1, 1, self.strides),
                              padding=self.padding)

You can then use it like any other layer:

model = keras.models.Sequential([
    ..., # other layers
    DepthMaxPool(3),
    ... # other layers
])

Answer 3:

Here is a brief example addressing the original question, written against the TensorFlow 1.x API. I tested it on a stock RGB image of size 225 x 225 with 3 channels.

Import the standard libraries and enable eager execution to view results quickly.

import tensorflow as tf
from scipy.misc import imread  # removed in SciPy 1.2; requires an older SciPy
import matplotlib.pyplot as plt
import numpy as np
tf.enable_eager_execution()

Read the image, cast it from uint8 to tf.float32, and add a batch dimension.

x = tf.cast(imread('tiger.jpeg'), tf.float32)
x = tf.reshape(x, shape=[-1, x.shape[0], x.shape[1], x.shape[2]])
print(x.shape)
input_channels = x.shape[3]

Create the filter for depthwise convolution

filters = tf.contrib.eager.Variable(tf.random_normal(shape=[3, 3, input_channels, 4]))
print(filters.shape)

Perform depthwise convolution with channel multiplier 4, which gives 3 * 4 = 12 output channels. Note that the padding has been kept at 'SAME'. It can be changed at will.

x = tf.nn.depthwise_conv2d(input=x, filter=filters, strides=[1, 1, 1, 1], padding='SAME', name='conv_1')
print(x.shape)

Perform the max_pooling2d. Since the output size of the pooling layer is (input_size - pool_size + 2 * padding)/stride + 1 and the padding is 'valid' (so the padding term is 0), we should get an output of (225 - 2 + 0)/1 + 1 = 224.

x = tf.layers.max_pooling2d(inputs=x, pool_size=2, strides=1, padding='valid', name='maxpool1')
print(x.shape)
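The output-size arithmetic can be checked with a small pure-Python sketch. The helper name pool_output_size is hypothetical. The 'valid' branch follows the formula in the text with zero padding, and the 'same' branch uses the usual TensorFlow convention of ceil(input_size / stride).

```python
import math

def pool_output_size(input_size, pool_size, stride, padding):
    """Spatial output size of a pooling layer along one dimension."""
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input count.
        return (input_size - pool_size) // stride + 1
    if padding == "same":
        # Input is padded so that every stride position yields an output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(pool_output_size(225, 2, 1, "valid"))  # 224
print(pool_output_size(225, 2, 1, "same"))   # 225
```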

Plot the figures to confirm.

fig, ax = plt.subplots(nrows=4, ncols=3)
q = 0
for ii in range(4):
    for jj in range(3):
        ax[ii, jj].imshow(np.squeeze(x[:,:,:,q]))
        ax[ii,jj].set_axis_off()
        q += 1
plt.tight_layout()
plt.show()

Source: https://stackoverflow.com/questions/36817868/tensorflow-how-to-pool-over-depth

Tags

python

tensorflow