Question
I have the following parameters defined for a max pool over the depth of the image (RGB), to compress it before the dense layer and readout, but it fails with an error saying that I cannot pool over depth and everything else:
sunset_poolmax_1x1x3_div_2x2x3_params = {
    'pool_function': tf.nn.max_pool,
    'ksize': [1, 1, 1, 3],
    'strides': [1, 1, 1, 3],
    'padding': 'SAME',
}
I changed the strides to [1,1,1,3] so that depth is the only dimension reduced by the pool, but it still doesn't work. Without depth pooling, I have to shrink the image so much to keep the colors that I can't get good results.
Actual Error:
ValueError: Current implementation does not support pooling in the batch and depth dimensions.
Answer 1:
tf.nn.max_pool does not support pooling over the depth dimension, which is why you get the error.
You can use a max reduction over the channel axis instead:
tf.reduce_max(input_tensor, axis=3, keepdims=True)
Older TensorFlow releases spelled these arguments reduction_indices=[3] and keep_dims=True. The keepdims argument preserves the rank of the tensor, so the result has the same layout that tf.nn.max_pool would produce if it supported pooling over the depth dimension. Note that the reduction takes the max over all channels at once, which matches a depth pool whose window covers every channel (3 for an RGB image).
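To make the relationship between a channel-wise max reduction and depth pooling concrete, here is a minimal NumPy sketch. It assumes NHWC layout and non-overlapping windows (stride equal to the window size); the helper name depth_max_pool is made up for this illustration and is not a TensorFlow function.

```python
import numpy as np

def depth_max_pool(x, pool_size):
    """Max over non-overlapping groups of `pool_size` channels (NHWC layout).

    Hypothetical helper for illustration; requires the channel count to be
    divisible by pool_size.
    """
    n, h, w, c = x.shape
    if c % pool_size != 0:
        raise ValueError("channel count must be divisible by pool_size")
    # Split the channel axis into (groups, pool_size) and reduce each group.
    return x.reshape(n, h, w, c // pool_size, pool_size).max(axis=-1)

x = np.random.rand(2, 4, 4, 6).astype(np.float32)

# A window covering every channel is the same as a plain max reduction
# over the channel axis with the rank preserved.
full = depth_max_pool(x, 6)
print(full.shape)                                           # (2, 4, 4, 1)
print(np.array_equal(full, x.max(axis=3, keepdims=True)))   # True

# A smaller window keeps one output channel per group of 3 input channels.
print(depth_max_pool(x, 3).shape)                           # (2, 4, 4, 2)
```

For an RGB image the first case is the relevant one: three channels in, one channel out.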
Answer 2:
TensorFlow now supports depth-wise max pooling with tf.nn.max_pool(). For example, here is how to implement it using pooling kernel size 3, stride 3 and VALID padding:
import tensorflow as tf
output = tf.nn.max_pool(images,
                        ksize=(1, 1, 1, 3),
                        strides=(1, 1, 1, 3),
                        padding="VALID")
You can use this in a Keras model by wrapping it in a Lambda layer:
from tensorflow import keras
depth_pool = keras.layers.Lambda(
    lambda X: tf.nn.max_pool(X,
                             ksize=(1, 1, 1, 3),
                             strides=(1, 1, 1, 3),
                             padding="VALID"))
model = keras.models.Sequential([
    ...,  # other layers
    depth_pool,
    ...  # other layers
])
Alternatively, you can write a custom Keras layer:
class DepthMaxPool(keras.layers.Layer):
    def __init__(self, pool_size, strides=None, padding="VALID", **kwargs):
        super().__init__(**kwargs)
        # Default to non-overlapping windows along the depth axis.
        if strides is None:
            strides = pool_size
        self.pool_size = pool_size
        self.strides = strides
        self.padding = padding

    def call(self, inputs):
        # Pool only over the last (channel) axis; use the configured stride.
        return tf.nn.max_pool(inputs,
                              ksize=(1, 1, 1, self.pool_size),
                              strides=(1, 1, 1, self.strides),
                              padding=self.padding)
You can then use it like any other layer:
model = keras.models.Sequential([
    ...,  # other layers
    DepthMaxPool(3),
    ...  # other layers
])
Answer 3:
Here is a brief TensorFlow example for the original question. I tested it on a stock RGB image of size 225 x 225 with 3 channels.
Import the standard libraries and enable eager execution to view results quickly (this answer uses the TensorFlow 1.x API):
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
tf.enable_eager_execution()
Read the image and cast it from uint8 to tf.float32:
# scipy.misc.imread was removed in SciPy 1.2, so plt.imread is used here instead
x = tf.cast(plt.imread('tiger.jpeg'), tf.float32)
x = tf.reshape(x, shape=[-1, x.shape[0], x.shape[1], x.shape[2]])
print(x.shape)
input_channels = x.shape[3]
Create the filter for the depthwise convolution, with shape [height, width, input_channels, channel_multiplier]:
filters = tf.contrib.eager.Variable(tf.random_normal(shape=[3, 3, input_channels, 4]))
print(filters.shape)
Perform the depthwise convolution with channel multiplier 4, which turns the 3 input channels into 3 x 4 = 12 output channels. Note that the padding has been kept as 'SAME'; it can be changed at will.
x = tf.nn.depthwise_conv2d(input=x, filter=filters, strides=[1, 1, 1, 1], padding='SAME', name='conv_1')
print(x.shape)
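To show where the 12 output channels come from, here is a naive NumPy sketch of a depthwise convolution. It assumes stride 1, 'SAME' padding, and odd kernel sizes; the function name depthwise_conv2d_same is made up for this illustration. Each input channel k is filtered by its own set of channel_multiplier kernels, and the result for multiplier index q lands in output channel k * channel_multiplier + q, which is the ordering the tf.nn.depthwise_conv2d documentation describes.

```python
import numpy as np

def depthwise_conv2d_same(x, filters):
    """Naive depthwise convolution: stride 1, SAME padding, odd kernel sizes.

    Hypothetical helper for illustration only.
    x:       (batch, height, width, channels)
    filters: (kh, kw, channels, channel_multiplier)
    returns: (batch, height, width, channels * channel_multiplier)
    """
    n, h, w, c = x.shape
    kh, kw, fc, m = filters.shape
    if fc != c:
        raise ValueError("filter channel count must match the input")
    ph, pw = kh // 2, kw // 2
    # Zero-pad the spatial axes so the output keeps the input height and width.
    padded = np.pad(x, ((0, 0), (ph, ph), (pw, pw), (0, 0)), mode="constant")
    out = np.zeros((n, h, w, c, m), dtype=np.result_type(x, filters))
    for di in range(kh):
        for dj in range(kw):
            window = padded[:, di:di + h, dj:dj + w, :]   # (n, h, w, c)
            # Broadcast (n, h, w, c, 1) against (c, m): channels never mix.
            out += window[..., None] * filters[di, dj]
    # Flatten (c, m) so output channel index = k * m + q.
    return out.reshape(n, h, w, c * m)

x = np.random.rand(1, 8, 8, 3)
filters = np.random.rand(3, 3, 3, 4)
print(depthwise_conv2d_same(x, filters).shape)  # (1, 8, 8, 12)
```

Because the channels are never summed together, a depthwise convolution increases the depth here rather than reducing it.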
Perform the max_pooling2d. Since the output size of the pooling layer is (input_size - pool_size + 2 * padding)/stride + 1 and the padding is 'valid' (that is, 0), we should get an output of (225 - 2 + 0)/1 + 1 = 224 along each spatial dimension.
x = tf.layers.max_pooling2d(inputs=x, pool_size=2, strides=1, padding='valid', name='maxpool1')
print(x.shape)
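The output-size arithmetic can be checked with a few lines of plain Python. This is a small sketch using the usual 'valid' and 'same' size rules; the function name pooled_size is made up for this illustration.

```python
import math

def pooled_size(input_size, pool_size, stride, padding="valid"):
    """Spatial output size of a pooling layer along one dimension.

    Hypothetical helper for illustration only.
    """
    padding = padding.lower()
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input count.
        return (input_size - pool_size) // stride + 1
    if padding == "same":
        # Input is padded as needed so that every stride step yields an output.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(pooled_size(225, 2, 1))          # 224
print(pooled_size(225, 2, 1, "same"))  # 225
print(pooled_size(225, 2, 2))          # 112
```

So a 2 x 2 pool with stride 1 and 'valid' padding trims exactly one row and one column from a 225 x 225 input.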
Plot the figures to confirm.
fig, ax = plt.subplots(nrows=4, ncols=3)
q = 0
for ii in range(4):
    for jj in range(3):
        ax[ii, jj].imshow(np.squeeze(x[:, :, :, q]))
        ax[ii, jj].set_axis_off()
        q += 1
plt.tight_layout()
plt.show()
Source: https://stackoverflow.com/questions/36817868/tensorflow-how-to-pool-over-depth