conv-neural-network

Receptive field size and object size in deep learning

Submitted by £可爱£侵袭症+ on 2019-12-24 01:44:17
Question: I can calculate the receptive field size of a 500 x 500 input image for VGGNet. The receptive field sizes are as follows:

    Layer Name = conv1,    Output size = 500, Stride = 1, RF size = 3
    Layer Name = relu1_1,  Output size = 500, Stride = 1, RF size = 3
    Layer Name = conv1_2,  Output size = 500, Stride = 1, RF size = 5
    Layer Name = relu1_2,  Output size = 500, Stride = 1, RF size = 5
    Layer Name = pool1,    Output size = 250, Stride = 2, RF size = 6
    Layer Name = conv2_1,  Output size = 250, Stride = 2, RF
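For reference, per-layer receptive fields like the ones listed above can be computed from kernel sizes and strides alone. A minimal Python sketch of the standard recursion (the layer specs below are illustrative VGG-style values, not taken from the excerpt):

    # Each entry is (name, kernel_size, stride); values are illustrative.
    layers = [
        ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
        ("conv2_1", 3, 1),
    ]

    rf, jump = 1, 1  # receptive field and effective stride seen from the input
    for name, k, s in layers:
        rf = rf + (k - 1) * jump   # RF grows by (k - 1) times the current jump
        jump = jump * s            # effective stride multiplies by the layer stride
        print(f"{name}: RF = {rf}, effective stride = {jump}")

Running this reproduces the values quoted in the question (3, 5, 6, ...).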

How to get features from several layers using C++ in Caffe

Submitted by a 夏天 on 2019-12-24 01:09:12
Question: How can I get both the 4096-dim feature layer and the 1000-dim class layer in Caffe after one forward pass using C++? I tried to look it up in extract_features.cpp, but it uses some weird datum object, so I cannot really understand how it works. So far I was simply cropping my prototxt files up to the layer that I wanted to extract and used [...]

    net->ForwardPrefilled();
    Blob<float> *output_layer = net->output_blobs()[0];
    const float *begin = output_layer->cpu_data();
    const float *end = begin
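One way to read several blobs after a single forward pass is to query them by name instead of cropping the prototxt. The sketch below illustrates the idea with the Python interface (pycaffe); the file paths and the blob names "fc7" and "prob" are assumptions that depend on the deploy prototxt, and the C++ analogue is net->blob_by_name(...):

    import caffe

    # Paths and blob names are placeholders for illustration.
    net = caffe.Net("deploy.prototxt", "weights.caffemodel", caffe.TEST)
    net.forward()  # one forward pass on whatever is currently in the input blob

    fc7 = net.blobs["fc7"].data.copy()    # e.g. the 4096-dim feature blob
    prob = net.blobs["prob"].data.copy()  # e.g. the 1000-dim class probabilities
    print(fc7.shape, prob.shape)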

ERROR (theano.gof.opt): Optimization failure due to: constant_folding

Submitted by 為{幸葍}努か on 2019-12-23 22:37:32
Question: In Neural Networks and Deep Learning, there is an object called network3 (a Python file written for Python 2.7 and Theano 0.7). I modified it to run with Python 3.6 and Theano 1.0.3. However, when I run the following code:

    import network3
    from network3 import Network
    from network3 import ConvPoolLayer, FullyConnectedLayer, SoftmaxLayer

    training_data, validation_data, test_data = network3.load_data_shared()
    mini_batch_size = 10
    net = Network([FullyConnectedLayer(n_in=784, n_out=100),
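When chasing an optimization failure like this, a common debugging step (not specific to network3, and not a fix in itself) is to disable Theano's graph optimizations and raise the exception verbosity so the underlying error surfaces; a sketch, assuming Theano 1.0:

    import theano

    # Disable graph optimizations (including constant_folding) and make errors verbose.
    # This is a debugging aid only; compiled functions will run much slower.
    theano.config.optimizer = "None"
    theano.config.exception_verbosity = "high"

    import network3  # import after changing the config so compilation picks it up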

Confused while reshaping an image array

Submitted by 妖精的绣舞 on 2019-12-23 20:49:19
Question: At the moment I'm trying to run a ConvNet. Each image that later feeds the neural net is stored as a list, but the list is currently created using three for-loops. Have a look:

    im = Image.open(os.path.join(p_input_directory, item))
    pix = im.load()
    image_representation = []

    # Get image into byte array
    for color in range(0, 3):
        for x in range(0, 32):
            for y in range(0, 32):
                image_representation.append(pix[x, y][color])

I'm pretty sure that this is not the nicest and most efficient way.
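The triple loop can be replaced by a single NumPy conversion that produces the same channel-major ordering; a sketch assuming 32 x 32 RGB images as in the excerpt (the file path is a placeholder):

    import numpy as np
    from PIL import Image

    im = Image.open("example.png").convert("RGB")  # placeholder path
    arr = np.asarray(im, dtype=np.uint8)           # shape (height, width, 3), indexed [y, x, color]

    # Transpose to (color, x, y) and flatten, matching the loop order
    # "for color -> for x -> for y" used in the question.
    image_representation = arr.transpose(2, 1, 0).flatten().tolist()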

Custom median pooling in TensorFlow

Submitted by 我怕爱的太早我们不能终老 on 2019-12-23 20:14:23
Question: I am trying to implement a median pooling layer in TensorFlow. However, there is neither tf.nn.median_pool nor tf.reduce_median. Is there a way to implement such a pooling layer with the Python API?

Answer 1: You could use something like:

    patches = tf.extract_image_patches(tensor, [1, k, k, 1], ...)
    m_idx = int(k*k/2+1)
    top = tf.nn.top_k(patches, m_idx, sorted=True).values
    median = tf.slice(top, [0, 0, 0, m_idx-1], [-1, -1, -1, 1])

To accommodate even sized median kernels and multiple channels, you
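A more self-contained variant of the same idea, written against TensorFlow 2.x rather than the 1.x API quoted in the answer (kernel size and input shape are illustrative): it extracts k x k patches, sorts each window, and keeps the middle element, handling only single-channel NHWC inputs with an odd window size:

    import tensorflow as tf

    def median_pool2d(x, k):
        """Median pooling for a single-channel NHWC tensor with an odd k*k window."""
        patches = tf.image.extract_patches(
            x, sizes=[1, k, k, 1], strides=[1, k, k, 1],
            rates=[1, 1, 1, 1], padding="VALID")           # shape (N, H', W', k*k)
        return tf.sort(patches, axis=-1)[..., k * k // 2]  # middle value of each window

    x = tf.random.uniform([1, 8, 8, 1])
    print(median_pool2d(x, 3).shape)  # (1, 2, 2)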

error: The model expects 3 input arrays, but only received one array. Found: array with shape (10, 20, 50, 50, 1)

Submitted by 醉酒当歌 on 2019-12-23 16:09:06
Question:

    main_model = Sequential()
    main_model.add(Conv3D(32, 3, 3, 3, input_shape=(20, 50, 50, 1)))
    main_model.add(Activation('relu'))
    main_model.add(MaxPooling3D(pool_size=(2, 2, 2)))
    main_model.add(Conv3D(64, 3, 3, 3))
    main_model.add(Activation('relu'))
    main_model.add(MaxPooling3D(pool_size=(2, 2, 2)))
    main_model.add(Dropout(0.8))
    main_model.add(Flatten())

    # lower features model - CNN2
    lower_model1 = Sequential()
    lower_model1.add(Conv3D(32, 3, 3, 3, input_shape=(20, 50, 50, 1)))
    lower_model1.add(Activation(
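For context, this error typically means the merged model ends up with three separate input branches while fit/predict was handed a single array; the usual remedy is to pass one array per branch. A hedged sketch using the Keras functional API (newer style than the excerpt's Sequential code; layer sizes and names are illustrative, not from the question):

    import numpy as np
    from keras.layers import Input, Conv3D, Flatten, Dense, concatenate
    from keras.models import Model

    # Minimal three-branch model; shapes follow the error message (10, 20, 50, 50, 1).
    inputs = [Input(shape=(20, 50, 50, 1)) for _ in range(3)]
    branches = [Flatten()(Conv3D(4, (3, 3, 3))(inp)) for inp in inputs]
    out = Dense(2, activation="softmax")(concatenate(branches))
    model = Model(inputs=inputs, outputs=out)
    model.compile(optimizer="sgd", loss="categorical_crossentropy")

    x = np.zeros((10, 20, 50, 50, 1), dtype="float32")
    y = np.zeros((10, 2), dtype="float32")
    model.fit([x, x, x], y, epochs=1, batch_size=2)  # one array per input branch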

Using squared difference of two images as loss function in tensorflow

Submitted by 那年仲夏 on 2019-12-23 15:43:33
Question: I'm trying to use the SSD between two images as the loss function for my network.

    # h_fc2 is my output layer, y_ is my label image.
    ssd = tf.reduce_sum(tf.square(y_ - h_fc2))
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(ssd)

The problem is that the weights then diverge and I get the error

    ReluGrad input is not finite. : Tensor had Inf values

Why is that? I did try some other things, like normalizing the ssd by the image size (did not work) or clipping the output values to 1 (does not
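One variant that often tames this kind of divergence is to average the squared error over pixels instead of summing it, optionally together with a smaller learning rate, so the gradient magnitude no longer scales with the image size. A sketch of a drop-in replacement for the two lines in the question (y_ and h_fc2 as defined there; TensorFlow 1.x style, as in the excerpt):

    # Averaging keeps the loss and its gradients independent of image size;
    # the smaller learning rate further guards against the Inf blow-up.
    mse = tf.reduce_mean(tf.square(y_ - h_fc2))
    train_step = tf.train.GradientDescentOptimizer(1e-4).minimize(mse)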

When to use what type of padding for convolution layers?

Submitted by 五迷三道 on 2019-12-23 12:33:03
Question: I know that when we use convolution layers in a neural net we usually use padding, mainly constant padding (e.g. zero padding). And there are different kinds of padding (e.g. symmetric, reflective, constant). But I am not sure what the advantages and disadvantages of the different padding methods are, and when to use which one.

Answer 1: It really depends on what the neural network is intended for; I would not frame it as a fixed list of pros and cons. This time the world cannot be put into a binary
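For concreteness, the three padding modes mentioned in the question can be compared directly with tf.pad; a small sketch assuming TensorFlow 2.x eager execution (values are illustrative):

    import tensorflow as tf

    x = tf.constant([[1., 2., 3.],
                     [4., 5., 6.]])
    pad = [[1, 1], [1, 1]]  # one row/column of padding on each side

    for mode in ("CONSTANT", "REFLECT", "SYMMETRIC"):
        # CONSTANT pads with zeros, REFLECT mirrors without repeating the edge value,
        # SYMMETRIC mirrors including the edge value.
        print(mode)
        print(tf.pad(x, pad, mode=mode).numpy())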

What is the replacement for a softmax layer when more than one output can be activated?

Submitted by 强颜欢笑 on 2019-12-23 12:32:53
Question: For example, I have a CNN which tries to predict digits from the MNIST dataset (code written using Keras). It has 10 outputs, which form a softmax layer. Only one of the outputs can be true (independently for each digit from 0 to 9):

    Real:      [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    Predicted: [0.02, 0.9, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

The sum of the predicted values equals 1.0 by definition of softmax. Let's say I have a task where I need to classify some objects that can fall into several categories:

    Real
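The standard replacement in this multi-label setting is independent sigmoid outputs trained with binary cross-entropy, so each class probability is scored on its own instead of competing through softmax; a minimal Keras sketch (layer sizes are illustrative, not from the question):

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential([
        Dense(64, activation="relu", input_shape=(784,)),
        Dense(10, activation="sigmoid"),   # one independent probability per class
    ])
    # binary_crossentropy treats every output as its own yes/no decision,
    # so several outputs can be close to 1 at the same time.
    model.compile(optimizer="adam", loss="binary_crossentropy")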

Labels in Caffe as Images

Submitted by 萝らか妹 on 2019-12-23 12:31:44
Question: I'm new to Caffe. I am trying to implement a Fully Convolutional Network (FCN-8s) for semantic segmentation. I have image data and label data, both of which are images, since this is for pixel-wise prediction. I tried using ImageData as the data type, but it asks for an integer label, which is not applicable in this scenario. Kindly advise how I can give Caffe a 2D label. Should I prefer LMDB over ImageData? If so, how do I proceed? I could not find any good tutorial/documentation
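One commonly used route for pixel-wise labels is to store the label images in their own LMDB and read them with a second Data layer, so the image LMDB and the label LMDB are iterated in the same order. A sketch, assuming pycaffe and the lmdb Python package are available (paths, keys, and sizes are illustrative):

    import lmdb
    import numpy as np
    import caffe

    labels = [np.zeros((1, 256, 256), dtype=np.uint8)]  # placeholder list of (C, H, W) label maps

    env = lmdb.open("labels_lmdb", map_size=1 << 30)
    with env.begin(write=True) as txn:
        for i, label in enumerate(labels):
            datum = caffe.io.array_to_datum(label)  # wrap the 3-D array in a Datum protobuf
            txn.put("{:08d}".format(i).encode("ascii"), datum.SerializeToString())

The training prototxt would then contain two Data layers with LMDB backends, one for the images and one for the label maps, using the same key ordering.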