Basic 1d convolution in tensorflow

Anonymous (unverified), submitted 2019-12-03 01:25:01

Question:

OK, I'd like to do a 1-dimensional convolution of time series data in TensorFlow. This is apparently supported using tf.nn.conv2d, according to these tickets and the manual. The only requirement is to set strides=[1,1,1,1]. Sounds simple!

However, I cannot work out how to do this in even a very minimal test case. What am I doing wrong?

Let's set this up.

import tensorflow as tf
import numpy as np
print(tf.__version__)
>>> 0.9.0

OK, now generate a basic convolution test on two small arrays. I will make it easy by using a batch size of 1, and since time series are 1-dimensional, I will have an "image height" of 1. And since it's a univariate time series, clearly the number of "channels" is also 1, so this will be simple, right?

g = tf.Graph()
with g.as_default():
    # data shape is "[batch, in_height, in_width, in_channels]",
    x = tf.Variable(np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(1,1,-1,1), name="x")
    # filter shape is "[filter_height, filter_width, in_channels, out_channels]"
    phi = tf.Variable(np.array([0.0, 0.5, 1.0]).reshape(1,-1,1,1), name="phi")
    conv = tf.nn.conv2d(
        phi,
        x,
        strides=[1, 1, 1, 1],
        padding="SAME",
        name="conv")

BOOM. Error.

ValueError: Dimensions 1 and 5 are not compatible 

OK, for a start, I don't understand how this can happen with any dimension, since I've specified that I'm padding the arguments in the convolution op.

But fine, maybe there are limits to that. I must have got the documentation confused and set up this convolution on the wrong axes of the tensor. I'll try all possible permutations:

for i in range(4):
    for j in range(4):
        shape1 = [1,1,1,1]
        shape1[i] = -1
        shape2 = [1,1,1,1]
        shape2[j] = -1
        x_array = np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(*shape1)
        phi_array = np.array([0.0, 0.5, 1.0]).reshape(*shape2)
        try:
            g = tf.Graph()
            with g.as_default():
                x = tf.Variable(x_array, name="x")
                phi = tf.Variable(phi_array, name="phi")
                conv = tf.nn.conv2d(
                    x,
                    phi,
                    strides=[1, 1, 1, 1],
                    padding="SAME",
                    name="conv")
                init_op = tf.initialize_all_variables()
            sess = tf.Session(graph=g)
            sess.run(init_op)
            print("SUCCEEDED!", x_array.shape, phi_array.shape, conv.eval(session=sess))
            sess.close()
        except Exception as e:
            print("FAILED!", x_array.shape, phi_array.shape, type(e), e.args or e._message)

Result:

FAILED! (5, 1, 1, 1) (3, 1, 1, 1)  ('Filter must not be larger than the input: Filter: (3, 1) Input: (1, 1)',)
FAILED! (5, 1, 1, 1) (1, 3, 1, 1)  ('Filter must not be larger than the input: Filter: (1, 3) Input: (1, 1)',)
FAILED! (5, 1, 1, 1) (1, 1, 3, 1)  ('Dimensions 1 and 3 are not compatible',)
FAILED! (5, 1, 1, 1) (1, 1, 1, 3)  No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 5, 1, 1) (3, 1, 1, 1)  No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 5, 1, 1) (1, 3, 1, 1)  ('Filter must not be larger than the input: Filter: (1, 3) Input: (5, 1)',)
FAILED! (1, 5, 1, 1) (1, 1, 3, 1)  ('Dimensions 1 and 3 are not compatible',)
FAILED! (1, 5, 1, 1) (1, 1, 1, 3)  No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 5, 1) (3, 1, 1, 1)  ('Filter must not be larger than the input: Filter: (3, 1) Input: (1, 5)',)
FAILED! (1, 1, 5, 1) (1, 3, 1, 1)  No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 5, 1) (1, 1, 3, 1)  ('Dimensions 1 and 3 are not compatible',)
FAILED! (1, 1, 5, 1) (1, 1, 1, 3)  No OpKernel was registered to support Op 'Conv2D' with these attrs
     [[Node: conv = Conv2D[T=DT_DOUBLE, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](x/read, phi/read)]]
FAILED! (1, 1, 1, 5) (3, 1, 1, 1)  ('Dimensions 5 and 1 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 3, 1, 1)  ('Dimensions 5 and 1 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 1, 3, 1)  ('Dimensions 5 and 3 are not compatible',)
FAILED! (1, 1, 1, 5) (1, 1, 1, 3)  ('Dimensions 5 and 1 are not compatible',)

Hmm. OK, it looks like there are two problems now. Firstly, the ValueError is about applying the filter along the wrong axis, I guess, although there are two forms.

But then the axes along which I can apply the filter are confusing too. Notice that it actually constructs the graph with input shape (5, 1, 1, 1) and filter shape (1, 1, 1, 3). AFAICT from the documentation, this should be a filter that looks at one example from the batch, one "pixel" and one "channel", and outputs 3 "channels". Why does that one work, then, when others do not?

Anyway, sometimes it does not fail while constructing the graph. Sometimes it constructs the graph and then we get a tensorflow.python.framework.errors.InvalidArgumentError. From some confusing GitHub tickets I gather this is probably due to the fact that I'm running on CPU instead of GPU, or alternatively to the fact that the convolution op is only defined for 32-bit floats, not 64-bit floats. If anyone could throw some light on which axes I should be aligning what on, in order to convolve a time series with a kernel, I'd be very grateful.

Answer 1:

I am sorry to say it, but your first code was almost right. You just swapped x and phi in tf.nn.conv2d:

g = tf.Graph()
with g.as_default():
    # data shape is "[batch, in_height, in_width, in_channels]",
    x = tf.Variable(np.array([0.0, 0.0, 0.0, 0.0, 1.0]).reshape(1, 1, 5, 1), name="x")
    # filter shape is "[filter_height, filter_width, in_channels, out_channels]"
    phi = tf.Variable(np.array([0.0, 0.5, 1.0]).reshape(1, 3, 1, 1), name="phi")
    conv = tf.nn.conv2d(
        x,
        phi,
        strides=[1, 1, 1, 1],
        padding="SAME",
        name="conv")
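As a sanity check on what the corrected call should produce, here is a minimal pure-Python sketch of the same computation. It assumes that tf.nn.conv2d computes a cross-correlation (the kernel is not flipped) and that SAME padding with stride 1 splits the zero padding evenly between both sides, with any extra zero on the right. The helper name correlate_same is hypothetical and not part of TensorFlow.

```python
def correlate_same(signal, kernel):
    """Cross-correlate signal with kernel, zero-padded to keep the input length.

    Hypothetical helper for illustration only; assumes stride 1.
    """
    k = len(kernel)
    pad_total = k - 1                  # zeros needed to keep the length unchanged
    pad_left = pad_total // 2          # any odd leftover zero goes on the right
    pad_right = pad_total - pad_left
    padded = [0.0] * pad_left + list(signal) + [0.0] * pad_right
    # Slide the (unflipped) kernel over the padded signal.
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


x = [0.0, 0.0, 0.0, 0.0, 1.0]
phi = [0.0, 0.5, 1.0]
result = correlate_same(x, phi)
print(result)  # [0.0, 0.0, 0.0, 1.0, 0.5]
```

A true mathematical convolution would reverse phi before sliding it, so the two nonzero outputs would appear in a different order and position.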

Update: TensorFlow has supported 1D convolution since version r0.11, through tf.nn.conv1d. I previously wrote a guide to it in the Stack Overflow Documentation (now defunct), which I am pasting here:


Guide to 1D convolution

Consider a basic example with an input of length 10, and dimension 16. The batch size is 32. We therefore have a placeholder with input shape [batch_size, 10, 16].

batch_size = 32
x = tf.placeholder(tf.float32, [batch_size, 10, 16])

We then create a filter of width 3 that takes 16 channels as input and also outputs 16 channels.

filter = tf.zeros([3, 16, 16])  # these should be real values, not 0 

Finally we apply tf.nn.conv1d with a stride and a padding:
- stride: an integer s
- padding: this works as in 2D; you can choose between SAME and VALID. SAME zero-pads the input so that the output length is the input length divided by the stride, rounded up (equal to the input length when the stride is 1), while VALID adds no zero padding.

For our example we take a stride of 2, and a valid padding.

output = tf.nn.conv1d(x, filter, stride=2, padding="VALID") 

The output shape should be [batch_size, 4, 16].
With padding="SAME", we would have had an output shape of [batch_size, 5, 16].
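The two output shapes above follow from the output-length rules TensorFlow documents for its convolution ops. A minimal sketch, where the function name conv1d_output_length is a hypothetical helper and not a TensorFlow API:

```python
import math


def conv1d_output_length(input_length, filter_width, stride, padding):
    """Length of the output along the convolved axis (illustrative helper)."""
    if padding == "VALID":
        # Only positions where the filter fits entirely inside the input.
        return math.ceil((input_length - filter_width + 1) / stride)
    if padding == "SAME":
        # Input is zero-padded, so the filter width does not matter.
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


valid_len = conv1d_output_length(10, 3, 2, "VALID")
same_len = conv1d_output_length(10, 3, 2, "SAME")
print(valid_len)  # 4
print(same_len)   # 5
```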



Answer 2:

I think I got it to work with the requirements that I needed. The comments and details of how it works are in the code:

import numpy as np

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

task_name = 'task_MNIST_flat_auto_encoder'
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
X_train, Y_train = mnist.train.images, mnist.train.labels # N x D
X_cv, Y_cv = mnist.validation.images, mnist.validation.labels
X_test, Y_test = mnist.test.images, mnist.test.labels

# data shape is "[batch, in_height, in_width, in_channels]",
# X_train = N x D
N, D = X_train.shape
# think of it as N images with height 1 and width D.
X_train = X_train.reshape(N,1,D,1)
x = tf.placeholder(tf.float32, shape=[None,1,D,1], name='x-input')
#x = tf.Variable( X_train , name='x-input')
# filter shape is "[filter_height, filter_width, in_channels, out_channels]"
filter_size, nb_filters = 10, 12 # filter_size , number of hidden units/units
# think of it as having nb_filters number of filters, each of size filter_size
W = tf.Variable( tf.truncated_normal(shape=[1, filter_size, 1,nb_filters], stddev=0.1) )
stride_convd1 = 2 # controls the stride for 1D convolution
conv = tf.nn.conv2d(input=x, filter=W, strides=[1, 1, stride_convd1, 1], padding="SAME", name="conv")

with tf.Session() as sess:
    sess.run( tf.initialize_all_variables() )
    sess.run(fetches=conv, feed_dict={x:X_train})

Thanks to Olivier for the help (see the discussion in his comments for further clarification).


Manually check it:

X_train_org = np.array([[0,1,2,3]])
N, D = X_train_org.shape
X_train_1d = X_train_org.reshape(N,1,D,1)
#X_train = tf.constant( X_train_org )
# think of it as N images with height 1 and width D.
xx = tf.placeholder(tf.float32, shape=[None,1,D,1], name='xx-input')
#x = tf.Variable( X_train , name='x-input')
# filter shape is "[filter_height, filter_width, in_channels, out_channels]"
filter_size, nb_filters = 2, 2 # filter_size , number of hidden units/units
# think of it as having nb_filters number of filters, each of size filter_size
filter_w = np.array([[1,3],[2,4]]).reshape(1,filter_size,1,nb_filters)
#W = tf.Variable( tf.truncated_normal(shape=[1,filter_size,1,nb_filters], stddev=0.1) )
W = tf.Variable( tf.constant(filter_w, dtype=tf.float32) )
stride_convd1 = 2 # controls the stride for 1D convolution
conv = tf.nn.conv2d(input=xx, filter=W, strides=[1, 1, stride_convd1, 1], padding="SAME", name="conv")

#C = tf.constant( (np.array([[4,3,2,1]]).T).reshape(1,1,1,4) , dtype=tf.float32 )
#
#tf.reshape( conv , [])
#y_tf = tf.matmul(conv, C)

##
x = tf.placeholder(tf.float32, shape=[None,D], name='x-input') # N x 4
W1 = tf.Variable( tf.constant( np.array([[1,2,0,0],[3,4,0,0]]).T, dtype=tf.float32 ) ) # 4 x 2
y1 = tf.matmul(x,W1) # (N x 4) x (4 x 2) = N x 2
W2 = tf.Variable( tf.constant( np.array([[0,0,1,2],[0,0,3,4]]).T, dtype=tf.float32 )) # 4 x 2
y2 = tf.matmul(x,W2) # (N x 4) x (4 x 2) = N x 2
C1 = tf.constant( np.array([[4,3]]).T, dtype=tf.float32 ) # 2 x 1
C2 = tf.constant( np.array([[2,1]]).T, dtype=tf.float32 ) # 2 x 1

p1 = tf.matmul(y1,C1)
p2 = tf.matmul(y2,C2)
y = p1 + p2
with tf.Session() as sess:
    sess.run( tf.initialize_all_variables() )
    print('manual conv')
    print(sess.run(fetches=y1, feed_dict={x:X_train_org}))
    print(sess.run(fetches=y2, feed_dict={x:X_train_org}))
    #print(sess.run(fetches=y, feed_dict={x:X_train_org}))
    print('tf conv')
    print(sess.run(fetches=conv, feed_dict={xx:X_train_1d}))
    #print(sess.run(fetches=y_tf, feed_dict={xx:X_train_1d}))

Outputs:

manual conv
[[ 2.  4.]]
[[  8.  18.]]
tf conv
[[[[  2.   4.]
   [  8.  18.]]]]
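The same numbers can be reproduced without TensorFlow. The sketch below assumes that, for this input, SAME padding adds no zeros (length 4, width 2 and stride 2 tile exactly), and that reshaping [[1,3],[2,4]] to (1, 2, 1, 2) yields the kernel [1, 2] for output channel 0 and [3, 4] for output channel 1. The function name strided_correlate is hypothetical.

```python
def strided_correlate(signal, filters, stride):
    """Slide each kernel over signal with the given stride, no padding.

    filters holds one 1-D kernel per output channel, all of equal width.
    Returns one row per window position and one column per output channel.
    """
    width = len(filters[0])
    positions = range(0, len(signal) - width + 1, stride)
    return [
        [sum(signal[p + j] * f[j] for j in range(width)) for f in filters]
        for p in positions
    ]


x = [0, 1, 2, 3]
filters = [[1, 2], [3, 4]]   # channel 0 and channel 1 kernels
out = strided_correlate(x, filters, stride=2)
print(out)  # [[2, 4], [8, 18]]
```

Each row corresponds to one window position ([0, 1] and [2, 3]), which is why the TensorFlow result has shape (1, 1, 2, 2).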


Answer 3:

In the new versions of TF (starting from 0.11) you have conv1d, so there is no need to use 2d convolution to do 1d convolution. Here is a simple example of how to use conv1d:

import tensorflow as tf
i = tf.constant([1, 0, 2, 3, 0, 1, 1], dtype=tf.float32, name='i')
k = tf.constant([2, 1, 3], dtype=tf.float32, name='k')

# conv1d expects data as [batch, width, in_channels]
data   = tf.reshape(i, [1, int(i.shape[0]), 1], name='data')
# and the kernel as [filter_width, in_channels, out_channels]
kernel = tf.reshape(k, [int(k.shape[0]), 1, 1], name='kernel')

res = tf.squeeze(tf.nn.conv1d(data, kernel, 1, 'VALID'))
with tf.Session() as sess:
    print(sess.run(res))

To understand how conv1d is calculated, take a look at various examples.
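For the i and k used above, the calculation can be traced by hand. This is a plain-Python sketch that assumes conv1d with stride 1 and VALID padding takes the dot product of the unflipped kernel with each full window of the input:

```python
i = [1, 0, 2, 3, 0, 1, 1]
k = [2, 1, 3]

# Every window of len(k) consecutive input values that fits inside i.
windows = [i[p:p + len(k)] for p in range(len(i) - len(k) + 1)]

# Dot product of each window with the kernel.
res = [sum(a * b for a, b in zip(w, k)) for w in windows]

for w, r in zip(windows, res):
    print(w, "->", r)
print(res)  # [8, 11, 7, 9, 4]
```

The first entry, for instance, is 1*2 + 0*1 + 2*3 = 8.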


