TensorFlow's asymmetric padding assumptions

Submitted by 时间秒杀一切 on 2019-11-29 14:16:46

Question


Why has TensorFlow chosen to prefer padding on the bottom right?

With SAME padding, it would feel more logical to me to anchor the kernel's center at the first real pixel. Because the total padding can be odd, some framework has to pad asymmetrically, and TensorFlow's choice of the bottom/right differs from some other frameworks, which creates a discrepancy between them. I do understand that asymmetric padding is in principle a good thing, since otherwise one would be left with an unused padding row/column.
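
For reference, TensorFlow's SAME rule derives the total padding from the output size and splits it with the smaller half first; a minimal standalone sketch (the helper name is mine, the formula follows the tf.nn documentation):

import math

def tf_same_padding_1d(in_size, kernel_size, stride):
    out_size = math.ceil(in_size / stride)  # SAME output length
    pad_total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_left = pad_total // 2            # smaller half goes left/top
    pad_right = pad_total - pad_left     # any odd remainder lands right/bottom
    return pad_left, pad_right

print(tf_same_padding_1d(in_size=5, kernel_size=2, stride=2))  # (0, 1)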

If TensorFlow had given precedence to padding on the left and top, its convolutions and weights would match Caffe/cuDNN/other frameworks, and weight conversion would be compatible regardless of padding.
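
When matching the other split matters, one well-known workaround (a sketch, not an official API; the function name is mine) is to pad explicitly with the larger half on the left and then convolve with VALID:

import tensorflow as tf

def conv1d_left_heavy(data, filters, stride, pad_total):
    pad_left = (pad_total + 1) // 2          # larger half left, Caffe/cuDNN style
    pad_right = pad_total - pad_left
    padded = tf.pad(data, [[0, 0], [0, 0], [pad_left, pad_right]])  # NCW: pad the width axis
    return tf.nn.conv1d(
        value=padded,
        filters=filters,
        stride=stride,
        padding='VALID',                     # all padding was applied explicitly
        data_format='NCW',
        )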

Code:

import numpy as np
import tensorflow as tf
import torch
import torch.nn as nn

tf.enable_eager_execution()

def conv1d_tf(data, kernel_weights, stride):
    # TensorFlow's conv1d filters have shape [filter_width, in_channels, out_channels]
    filters = np.reshape(kernel_weights, [len(kernel_weights), 1, 1])
    out = tf.nn.conv1d(
        value=data,
        filters=filters,
        stride=stride,
        padding='SAME',      # TensorFlow chooses the left/right split itself
        data_format='NCW',
        )
    return out


def conv1d_pytorch(data, kernel_weights, stride):
    # PyTorch's conv1d weights have shape [out_channels, in_channels, filter_width]
    filters = np.reshape(kernel_weights, [1, 1, len(kernel_weights)])
    kernel_size = len(kernel_weights)
    size = data.shape[-1]
    def same_padding(size, kernel_size, stride, dilation):
        # hand-rolled "same" estimate; nn.Conv1d applies it to both ends equally
        padding = ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2
        return padding
    # note: an undilated convolution is dilation=1 in PyTorch's convention
    padding = same_padding(size=size, kernel_size=kernel_size, stride=stride, dilation=0)
    conv = nn.Conv1d(
        in_channels=1,
        out_channels=1,
        kernel_size=kernel_size,
        stride=stride,
        bias=False,
        padding=padding,     # symmetric padding, unlike TensorFlow's SAME
        )
    conv.weight = torch.nn.Parameter(torch.from_numpy(filters))
    return conv(torch.from_numpy(data))


data = np.array([[[1, 2, 3, 4]]], dtype=np.float32)
kernel_weights = np.array([0, 1], dtype=np.float32)
stride = 2

out_tf = conv1d_tf(data=data, kernel_weights=kernel_weights, stride=stride)
out_pytorch = conv1d_pytorch(data=data, kernel_weights=kernel_weights, stride=stride)

print('TensorFlow: %s' % out_tf)
print('PyTorch: %s' % out_pytorch)

Output:

TensorFlow: tf.Tensor([[[2. 4.]]], shape=(1, 1, 2), dtype=float32)
PyTorch: tensor([[[1., 3.]]], grad_fn=<SqueezeBackward1>)
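
To spell out where the two results come from, here is a plain-NumPy trace of both window sequences (illustration only, not part of either framework):

import numpy as np

x = np.array([1, 2, 3, 4], dtype=np.float32)
k = np.array([0, 1], dtype=np.float32)

# TensorFlow SAME: out = ceil(4/2) = 2 and pad_total = max((2-1)*2 + 2 - 4, 0) = 0,
# so the kernel slides over the unpadded input: windows (1,2) and (3,4).
print([float(k @ w) for w in (x[0:2], x[2:4])])        # [2.0, 4.0]

# PyTorch with symmetric padding=1 slides over [0, 1, 2, 3, 4, 0];
# its first two windows are (0,1) and (2,3).
xp = np.pad(x, 1)
print([float(k @ w) for w in (xp[0:2], xp[2:4])])      # [1.0, 3.0]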

Answer 1:


This is for historical compatibility reasons with previous (non-public) frameworks. It is unfortunate that the definitions aren't clearer, since this is a common stumbling block when porting models between libraries. One common way to sidestep it is sketched below.
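
A hedged sketch of that workaround (names are my own, not code from the answer): reproduce TensorFlow's split explicitly in PyTorch with F.pad, then run the convolution with no implicit padding:

import math
import torch
import torch.nn.functional as F

def conv1d_tf_same(x, weight, stride):
    in_size = x.shape[-1]
    kernel_size = weight.shape[-1]
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_left = pad_total // 2                     # TF puts the extra on the right
    x = F.pad(x, (pad_left, pad_total - pad_left))
    return F.conv1d(x, weight, stride=stride)     # no implicit padding

x = torch.tensor([[[1., 2., 3., 4.]]])
w = torch.tensor([[[0., 1.]]])
print(conv1d_tf_same(x, w, stride=2))  # tensor([[[2., 4.]]]), matching TensorFlow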



Source: https://stackoverflow.com/questions/42924324/tensorflows-asymmetric-padding-assumptions
