TensorFlow 3-channel order of color inputs

情歌与酒 2021-01-20 01:15

I'm using TensorFlow to process color images with a convolutional neural network. A code snippet is below.

My code runs so I think I got the number of channels rig

1 answer
  • 2021-01-20 01:47

    TL;DR: With your current program, the in-memory layout of the data should be R-G-B-R-G-B-R-G-B-R-G-B...

    I assume from this line that you are passing in RGB images with 28x28 pixels:

    self.x_image = tf.reshape(self.c_x, [-1, 28, 28, 3])
    

    We can call the dimensions of self.x_image "batch", "height", "width", and "channel". This matches the default data format for tf.nn.conv2d() and tf.nn.max_pool().

    In TensorFlow, the in-memory representation of a tensor is row-major order (or "C" ordering, because that is the representation of arrays in the C programming language). Essentially this means that the rightmost dimension is the fastest changing, and the elements of the tensor are packed together in memory in the following order (where ? stands for the unknown batch size, minus 1):

    [0,  0,  0,  0]
    [0,  0,  0,  1]
    [0,  0,  0,  2]
    [0,  0,  1,  0]
    ...
    [?, 27, 27,  1]
    [?, 27, 27,  2]
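
The ordering above can be checked with a minimal pure-Python sketch. It does not use TensorFlow; the helper name flat_offset and the batch size of 2 are assumptions made only for illustration. It computes the row-major offset of an index and shows that the channel index changes fastest:

```python
from itertools import islice, product

def flat_offset(index, shape):
    """Return the row-major (C-order) offset of a multi-dimensional index."""
    offset = 0
    for i, dim in zip(index, shape):
        # Each step to the right multiplies by that dimension's size.
        offset = offset * dim + i
    return offset

# Hypothetical batch of 2 images, each 28x28 pixels with 3 channels.
shape = (2, 28, 28, 3)

# itertools.product also iterates with the rightmost index changing fastest.
first_four = list(islice(product(*(range(d) for d in shape)), 4))
for idx in first_four:
    print(idx, "->", flat_offset(idx, shape))
# (0, 0, 0, 0) -> 0
# (0, 0, 0, 1) -> 1
# (0, 0, 0, 2) -> 2
# (0, 0, 1, 0) -> 3
```

Consecutive memory locations therefore hold the three channel values of one pixel before moving on to the next pixel, which is the R-G-B-R-G-B layout.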
    

    Therefore, if your data is actually stored one color plane at a time (all red values, then all green, then all blue), your program isn't interpreting the image data correctly. There are at least two options:

    1. Reshape your data to match its true order ("batch", "channels", "height", "width"):

      self.x_image = tf.reshape(self.c_x, [-1, 3, 28, 28])
      

      In fact, this format is sometimes more efficient for convolutions. You can instruct tf.nn.conv2d() and tf.nn.max_pool() to use it without transposing by passing the optional argument data_format="NCHW", but you will also need to change the shape of your bias variables to match.

    2. Use tf.transpose() to rearrange your image data into the layout your program expects:

      self.x_image = tf.transpose(tf.reshape(self.c_x, [-1, 3, 28, 28]), [0, 2, 3, 1])
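
The reordering in option 2 can be sketched in pure Python for a single image. This is an illustration only, not TensorFlow code: the function name planar_to_interleaved and the tiny 2x2 image with labeled values are assumptions. It moves each value from its channel-planar position to the interleaved position that the reshape to [-1, 3, 28, 28] followed by the transpose with [0, 2, 3, 1] would give it:

```python
def planar_to_interleaved(flat, height, width, channels=3):
    """Reorder one image from (channels, height, width) to (height, width, channels)."""
    plane = height * width  # number of values in one color plane
    out = []
    for pixel in range(plane):
        for c in range(channels):
            # In planar layout, channel c of this pixel sits c planes further on.
            out.append(flat[c * plane + pixel])
    return out

# Hypothetical 2x2 image stored plane by plane: all reds, then greens, then blues.
planar = ["R0", "R1", "R2", "R3",
          "G0", "G1", "G2", "G3",
          "B0", "B1", "B2", "B3"]

interleaved = planar_to_interleaved(planar, height=2, width=2)
print(interleaved)
# ['R0', 'G0', 'B0', 'R1', 'G1', 'B1', 'R2', 'G2', 'B2', 'R3', 'G3', 'B3']
```

Reshaping planar data directly to [-1, 28, 28, 3] would instead group R0, R1, R2 as the three "channels" of the first pixel, which is why a plain reshape is not enough for data in this layout.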
      