neural-network

Strange convergence in simple Neural Network

送分小仙女 · submitted on 2019-12-20 05:44:38
Question: I've been struggling for some time with building a simplistic NN in Java. I've been working on and off on this project for a few months and I want to finish it. My main issue is that I don't know how to implement backpropagation correctly (all the sources I've found use Python, heavy math jargon, or explain the idea too briefly). Today I tried deducing the rule by myself, and the rule I'm using is: weight update = error * sigmoidDerivative(error) * the weight itself; error = output - actual; (last layer) error =
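For reference, the standard backpropagation rule differs from the one above in two places: the sigmoid derivative is taken of the neuron's output (not of the error), and the update is scaled by the incoming activation (not by the weight itself). A minimal one-neuron sketch in NumPy (the Java translation is mechanical; all numbers here are toy values, not from the question):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(a):
    # derivative expressed in terms of the activation a = sigmoid(z)
    return a * (1.0 - a)

# One output neuron, one training step (toy numbers).
x = np.array([0.5, 0.1])      # input activations
w = np.array([0.4, -0.2])     # weights into the output neuron
target = 1.0
lr = 0.5                      # learning rate (assumed)

a = sigmoid(np.dot(w, x))     # forward pass
error = a - target            # output - actual
delta = error * sigmoid_derivative(a)  # derivative of the OUTPUT, not of the error
w -= lr * delta * x           # update scales by the incoming activation, not the weight
```

For a hidden layer, the error term is the weighted sum of the downstream deltas, propagated back through the same pattern.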

A formula to find the size of a matrix after convolution

怎甘沉沦 · submitted on 2019-12-20 04:54:33
Question: If my input size is 5x5, the stride is 1x1, and the filter size is 3x3, then I can compute on paper that the final size of the convolved matrix will be 3x3. But when the input size changes to 28x28 or 50x50, how can I compute the size of the convolved matrix on paper? Is there any formula or trick to do that? Answer 1: Yes, there's a formula (see the details in the cs231n class notes): W2 = (W1 - F + 2*P) / S + 1, H2 = (H1 - F + 2*P) / S + 1, where W1xH1 is the original image size, F is the filter
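The formula from the answer is one line of code; `conv_output_size` below is a hypothetical helper, not part of any library, and assumes the stride divides evenly (otherwise most frameworks floor the division):

```python
def conv_output_size(w, f, p=0, s=1):
    """(W - F + 2P) / S + 1 for one spatial dimension."""
    return (w - f + 2 * p) // s + 1

# The cases from the question: 3x3 filter, stride 1, no padding.
print(conv_output_size(5, 3))    # 3
print(conv_output_size(28, 3))   # 26
print(conv_output_size(50, 3))   # 48
```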

Crop size error in Caffe model

我与影子孤独终老i · submitted on 2019-12-20 04:54:24
Question: I'm trying to train a Caffe model, and I get this error:

I0806 09:41:02.010442 2992 sgd_solver.cpp:105] Iteration 360, lr = 9.76e-05
F0806 09:41:20.544955 2998 data_transformer.cpp:168] Check failed: height <= datum_height (224 vs. 199)
*** Check failure stack trace: ***
    @ 0x7f82b051edaa (unknown)
    @ 0x7f82b051ece4 (unknown)
    @ 0x7f82b051e6e6 (unknown)
    @ 0x7f82b0521687 (unknown)
    @ 0x7f82b0b8e9e0 caffe::DataTransformer<>::Transform()
    @ 0x7f82b0c09a2f caffe::DataLayer<>::load_batch()
    @ 0x7f82b0c9aa5
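The failing check compares crop_size (224 in the prototxt) against the stored image height (199), so at least one image in the dataset is smaller than the crop. A pre-flight check in the same spirit can catch this before training; `check_crop` is a hypothetical helper, not a Caffe API:

```python
def check_crop(crop_size, height, width):
    # Mirrors Caffe's check: "Check failed: height <= datum_height"
    if crop_size > height or crop_size > width:
        raise ValueError(
            "crop_size %d exceeds image size %dx%d; resize the images "
            "or lower crop_size in the prototxt" % (crop_size, height, width))

check_crop(224, 256, 256)      # a 256x256 image passes
# check_crop(224, 199, 224)    # reproduces the failure above (199 < 224)
```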

Convolutional neural network outputting equal probabilities for all labels

给你一囗甜甜゛ · submitted on 2019-12-20 04:24:14
Question: I am currently training a CNN on MNIST, and the output probabilities (softmax) converge to [0.1, 0.1, ..., 0.1] as training goes on. The initial values aren't uniform, so I can't figure out whether I'm doing something stupid here. I'm only training for 15 steps, just to see how training progresses; even though that's a low number, I don't think it should result in uniform predictions. import numpy as np import tensorflow as tf import imageio from sklearn.datasets import fetch_mldata mnist = fetch
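Uniform softmax outputs usually mean the logits themselves have collapsed to (nearly) equal values, commonly from a too-large learning rate or unscaled inputs, so printing the pre-softmax logits is a quick diagnostic. The identity is easy to verify:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

# Equal logits -> exactly uniform probabilities over 10 classes:
p = softmax(np.zeros(10))
print(p)  # [0.1, 0.1, ..., 0.1]

# Distinct logits -> non-uniform probabilities:
q = softmax(np.array([2.0, 1.0, 0.1]))
```

If the logits are all equal, the problem is upstream of the softmax (e.g. dead ReLUs or saturated weights), not in the softmax itself.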

How do I import a data file as a matrix and run a .m file from a python script?

早过忘川 · submitted on 2019-12-20 04:22:08
Question: I have a .m file that is used to run a neural network in MATLAB, which I have installed locally on my computer. I am trying to write a Python script that will loop through a list of possible transfer and training functions for the neural network multiple times. I've written a function to open and edit the .m file, but I don't know how to: 1) run the .m file from the Python script, and 2) import the necessary data for the neural network as a space-delimited matrix. I have three data files that need
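A common pattern for both steps, sketched under assumptions: MATLAB R2019a+ provides the -batch flag (older releases use -r "run('myscript.m'); exit" instead), "myscript" is a hypothetical script name, and numpy.loadtxt handles space-delimited matrices natively:

```python
import shutil
import subprocess
import numpy as np

# 1) Run the .m file from Python (guarded so the sketch runs without MATLAB).
if shutil.which("matlab"):
    subprocess.run(["matlab", "-batch", "myscript"], check=True)

# 2) Load a space-delimited data file as a matrix.
with open("data.txt", "w") as f:      # toy stand-in for one of the three data files
    f.write("1 2 3\n4 5 6\n")
matrix = np.loadtxt("data.txt")       # ndarray of shape (2, 3)
```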

Generator “TypeError: 'generator' object is not an iterator”

删除回忆录丶 · submitted on 2019-12-20 03:32:22
Question: Due to RAM limitations, I followed these instructions and built a generator that draws small batches and passes them to Keras's fit_generator. But Keras can't prepare the queue with multiprocessing even though I inherit from Sequence. Here is my generator for multiprocessing:

class My_Generator(Sequence):
    def __init__(self, image_filenames, labels, batch_size):
        self.image_filenames, self.labels = image_filenames, labels
        self.batch_size = batch_size
    def __len__(self):
        return np.ceil
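This error typically appears when a plain generator function is handed to fit_generator with use_multiprocessing=True; generators can't be pickled across processes, which is exactly what Sequence is for. A minimal sketch of the Sequence pattern (the import fallback just lets the snippet run without TensorFlow installed; note that __len__ must return an int, since np.ceil alone returns a float):

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    Sequence = object  # fallback so the sketch loads without TensorFlow

class MyGenerator(Sequence):
    def __init__(self, image_filenames, labels, batch_size):
        self.image_filenames, self.labels = image_filenames, labels
        self.batch_size = batch_size

    def __len__(self):
        # np.ceil returns a float; Keras requires an int here
        return int(np.ceil(len(self.image_filenames) / float(self.batch_size)))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        return self.image_filenames[lo:hi], self.labels[lo:hi]

gen = MyGenerator(list(range(10)), list(range(10)), batch_size=3)
```

With both __len__ and __getitem__ defined, the instance can be passed directly to fit_generator with workers > 1 and use_multiprocessing=True.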

Return Inverse Hessian Matrix at the end of DNN Training and Partial Derivatives wrt the Inputs

淺唱寂寞╮ · submitted on 2019-12-20 03:31:13
Question: Using Keras with TensorFlow as the backend, I have built a DNN that takes stellar spectra as input (7213 data points) and outputs three stellar parameters (temperature, gravity, and metallicity). The network trains well and predicts well on my test sets, but for the results to be scientifically useful, I need to be able to estimate my errors. The first step is to obtain the inverse Hessian matrix, which doesn't seem to be possible using Keras alone. Therefore I am
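Keras alone doesn't expose the Hessian, and for a real network one would use automatic differentiation through the backend (e.g. tf.hessians in TF 1.x) rather than numerics, but the idea can be sketched with a finite-difference Hessian on a toy loss whose answer is known analytically:

```python
import numpy as np

def hessian_fd(f, x, eps=1e-3):
    """Central finite-difference Hessian of scalar f at x (toy sketch;
    use automatic differentiation for a real DNN loss)."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

# Toy quadratic loss: the Hessian is analytically [[2, 0], [0, 6]].
f = lambda x: x[0] ** 2 + 3 * x[1] ** 2
H = hessian_fd(f, np.array([0.3, -0.7]))
H_inv = np.linalg.inv(H)   # inverse Hessian -> parameter covariance estimate
```

The same inverse-Hessian quantity, evaluated at the trained weights, is what feeds the standard Laplace-style error estimate; the partial derivatives with respect to the inputs come from the same autodiff machinery (e.g. tf.gradients of the outputs w.r.t. the input tensor).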

Why does my model predict the same label?

我怕爱的太早我们不能终老 · submitted on 2019-12-20 03:07:02
Question: I am training a small network and the training seems to go fine: the validation loss decreases, I reach a validation accuracy of around 80%, and training actually stops once there is no more improvement (patience=10). It trained for 40 epochs. However, it keeps predicting only one class for every test image! I tried initializing the conv layers randomly, I added regularizers, I switched from Adam to SGD, I added clipvalue, and I added dropouts. I also switched to softmax (I have only two labels but I saw
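A model collapsing onto a single class is often a class-imbalance symptom rather than an architecture problem, so checking the label distribution is a cheap first step; inverse-frequency class weights (which Keras accepts via fit's class_weight argument) are one common remedy. The label counts below are hypothetical:

```python
from collections import Counter

# Hypothetical label list; a 90/10 split like this often makes a
# small network collapse onto the majority class.
labels = [0] * 90 + [1] * 10
counts = Counter(labels)

# Inverse-frequency weights: rarer classes get proportionally larger weight.
total = len(labels)
class_weight = {c: total / (len(counts) * n) for c, n in counts.items()}
# -> pass as model.fit(..., class_weight=class_weight) in Keras
```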

'NoneType' object has no attribute '_inbound_nodes'

偶尔善良 · submitted on 2019-12-20 03:00:12
Question: Hi, I am trying to build a mixture-of-experts neural network. I found code here: http://blog.sina.com.cn/s/blog_dc3c53e90102x9xu.html. My goal is that the gate and the experts come from different data, but with the same dimensions.

def sliced(x, expert_num):
    return x[:, :, :expert_num]

def reduce(x, axis):
    return K.sum(x, axis=axis, keepdims=True)

def gatExpertLayer(inputGate, inputExpert, expert_num, nb_class):
    # expert_num = 30
    # nb_class = 10
    input_vector1 = Input(shape=(inputGate.shape[1:]))
    input_vector2
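This error typically means a backend operation (slicing, K.sum, etc.) was applied to a Keras tensor outside of a Layer, so the resulting tensor has no _inbound_nodes for Model() to trace. The usual fix is to wrap such ops in Lambda layers; a minimal sketch of the pattern (the import guard just lets the snippet load without TensorFlow):

```python
import numpy as np

def sliced(x, expert_num):
    # plain function; must be wrapped in a Lambda layer before use in a model
    return x[:, :, :expert_num]

try:
    from tensorflow.keras.layers import Lambda
    from tensorflow.keras import backend as K

    # Wrong: y = sliced(keras_tensor, 30)          -> result has no _inbound_nodes
    # Right: wrap each backend op in a Lambda layer:
    slice_layer = Lambda(sliced, arguments={"expert_num": 30})
    sum_layer = Lambda(lambda t: K.sum(t, axis=1, keepdims=True))
    # y = slice_layer(keras_tensor); s = sum_layer(y)  -> traceable by Model()
except ImportError:
    pass  # sketch still loads without TensorFlow installed
```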

Why do we have to specify output shape during deconvolution in tensorflow?

懵懂的女人 · submitted on 2019-12-20 02:29:30
Question: The TF documentation lists an output_shape parameter for tf.conv2d_transpose. Why is this needed? Don't the strides, filter size, and padding parameters of the layer determine the output shape, the way they do during convolution? Answer 1: This question was already asked on the TF GitHub and received an answer: output_shape is needed because the shape of the output can't necessarily be computed from the shape of the input, specifically if the output is smaller than the filter and
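The ambiguity is easy to see from the forward formula: because the stride division is floored, two different input sizes can map to the same convolution output, so the transpose cannot recover the input size from strides, filter, and padding alone:

```python
def conv_output_size(w, f, s, p):
    # forward convolution with explicit padding; note the floor division
    return (w - f + 2 * p) // s + 1

# 3x3 filter, stride 2, no padding: inputs of width 7 AND 8 both give output 3,
# so a transposed conv producing width 3 can't know which one to reconstruct.
print(conv_output_size(7, 3, 2, 0))  # 3
print(conv_output_size(8, 3, 2, 0))  # 3 (the floor discards a column)
```

output_shape resolves exactly this ambiguity by letting the caller pick the intended input size.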