backpropagation

Tune input features using backprop in keras

Submitted by 旧时模样 on 2019-12-05 21:47:55
I am trying to implement discriminant condition codes in Keras as proposed in Xue, Shaofei, et al., "Fast adaptation of deep neural network based on discriminant codes for speech recognition." The main idea is that you encode each condition as an input parameter and let the network learn the dependency between the condition and the feature-label mapping. On a new dataset, instead of adapting the entire network, you just tune these weights using backprop. For example, say my network looks like this:

    X ---->|----|
           |DNN |----> Y
    Z ---->|----|

X: features, Y: labels, Z: condition codes. Now, given a pretrained …
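
A minimal sketch of this kind of setup in Keras (not the paper's exact architecture; the layer sizes, variable names and the adaptation loop are illustrative assumptions): features X and a condition-code input Z feed a shared DNN, and for adaptation every Dense layer is frozen while only the code vector, held as a tf.Variable, is updated by backprop.

    import tensorflow as tf
    from tensorflow import keras

    n_feat, n_code, n_labels = 40, 8, 10            # illustrative sizes

    x_in = keras.Input(shape=(n_feat,), name="X")
    z_in = keras.Input(shape=(n_code,), name="Z")
    h = keras.layers.Concatenate()([x_in, z_in])
    h = keras.layers.Dense(256, activation="relu")(h)
    h = keras.layers.Dense(256, activation="relu")(h)
    y_out = keras.layers.Dense(n_labels, activation="softmax")(h)
    model = keras.Model([x_in, z_in], y_out)

    # Adaptation on new data: freeze the whole network, treat the code as a
    # variable, and update only that variable by gradient descent.
    for layer in model.layers:
        layer.trainable = False
    code = tf.Variable(tf.zeros((1, n_code)))       # the weights being tuned
    opt = keras.optimizers.SGD(learning_rate=0.1)
    loss_fn = keras.losses.CategoricalCrossentropy()

    x_new = tf.random.normal((1, n_feat))           # stand-in adaptation batch
    y_new = tf.one_hot([3], n_labels)

    for _ in range(100):
        with tf.GradientTape() as tape:
            loss = loss_fn(y_new, model([x_new, code]))
        grads = tape.gradient(loss, [code])         # gradient w.r.t. the code only
        opt.apply_gradients(zip(grads, [code]))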

Can I (selectively) invert Theano gradients during backpropagation?

Submitted by 浪尽此生 on 2019-12-05 18:47:45
Question: I'm keen to make use of the architecture proposed in the recent paper "Unsupervised Domain Adaptation by Backpropagation" in the Lasagne/Theano framework. What makes this paper a bit unusual is that it incorporates a 'gradient reversal layer', which inverts the gradient during backpropagation (the arrows along the bottom of the figure are the backpropagation paths whose gradient is inverted). In the paper the authors claim that the approach "can be implemented using any …
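
Conceptually (a framework-agnostic toy in plain NumPy rather than Theano, purely illustrative), a gradient reversal layer is the identity on the forward pass and multiplies the incoming gradient by -lambda on the backward pass:

    import numpy as np

    def grl_forward(x):
        return x                        # identity on the forward pass

    def grl_backward(grad_out, lam=1.0):
        return -lam * grad_out          # flipped gradient flows to earlier layers

    # Toy check: for f(x) = sum(grl(x)**2), the layer below receives -2x.
    x = np.array([1.0, -2.0, 3.0])
    grad_y = 2.0 * grl_forward(x)       # d(sum(y**2)) / dy
    print(grl_backward(grad_y))         # [-2.  4. -6.]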

How does tensorflow handle non differentiable nodes during gradient calculation?

Submitted by 故事扮演 on 2019-12-05 12:43:17
I understand the concept of automatic differentiation, but I couldn't find any explanation of how TensorFlow calculates the error gradient for non-differentiable functions, for example tf.where in my loss function or tf.cond in my graph. It works just fine, but I would like to understand how TensorFlow backpropagates the error through such nodes, since there is no formula to calculate their gradient. In the case of tf.where, you have a function with three inputs, condition C, value-on-true T and value-on-false F, and one output Out. The gradient receives one value and has to return …
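
As an illustration only (plain NumPy, not TensorFlow's actual registered gradient code), a select/where node can route the single incoming gradient element-wise to its T and F inputs, with the boolean condition receiving no gradient at all:

    import numpy as np

    def where_backward(cond, grad_out):
        grad_T = np.where(cond, grad_out, 0.0)   # gradient goes to T where cond is True
        grad_F = np.where(cond, 0.0, grad_out)   # and to F where cond is False
        return grad_T, grad_F                    # the condition itself gets no gradient

    cond = np.array([True, False, True])
    grad_out = np.array([0.1, 0.2, 0.3])
    print(where_backward(cond, grad_out))
    # (array([0.1, 0. , 0.3]), array([0. , 0.2, 0. ]))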

Looping through training data in Neural Networks Backpropagation Algorithm

Submitted by 99封情书 on 2019-12-05 02:09:04
How many times do I use a sample of training data in one training cycle? Say I have 60 training rows. I go through the 1st row, do a forward pass, and adjust the weights using the results of the backward pass, using the sigmoid function as below:

    Forward pass:
        Si = sum of (Wi * Uj)
        Ui = f(Si) = 1 / (1 + e^(-Si))
    Backward pass (output cell):
        delta = (expected - Ui) * f'(Si),  where f'(Si) = Ui * (1 - Ui)

Do I then go through the 2nd row and do the same process as for the 1st, or do I keep going over the 1st row until its error is low enough? I hope someone can help.

Training the network: you should use each instance of the training …
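
A minimal sketch of the usual answer (one epoch = one pass over all 60 rows, updating the weights after every row, then repeating epochs until the summed error is small; the data and names here are purely illustrative), using the single-neuron sigmoid update written above:

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(60, 5))          # 60 training rows
    Y = (X.sum(axis=1) > 0).astype(float)         # toy targets
    w = rng.normal(scale=0.1, size=5)
    lr = 0.5

    for epoch in range(1000):
        total_error = 0.0
        for u, d in zip(X, Y):                    # one pass over all rows = one epoch
            s = np.dot(w, u)                      # Si = sum(Wi * Uj)
            o = sigmoid(s)                        # Ui = f(Si)
            delta = (d - o) * o * (1 - o)         # (expected - Ui) * f'(Si)
            w += lr * delta * u                   # adjust weights after this sample
            total_error += 0.5 * (d - o) ** 2
        if total_error < 1e-3:                    # stop once the epoch error is small
            break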

Part 2 Resilient backpropagation neural network

Submitted by 谁都会走 on 2019-12-04 23:18:27
Question: This is a follow-on question to this post. For a given neuron, I'm unclear on how to take the partial derivative of its error and the partial derivative of its weight. Working from this web page, it's clear how the propagation works (although I'm dealing with resilient propagation). For a feedforward neural network, we have to 1) while moving forwards through the neural net, trigger the neurons, 2) from the output-layer neurons, calculate a total error, and then 3) moving backwards, propagate that …
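
For reference, a hedged sketch of a sign-based RPROP-style weight update (variant details differ between implementations; eta_plus/eta_minus and the step clamps below are the commonly quoted defaults, and the array names are illustrative): only the sign of dE/dw is used, the per-weight step grows while the sign stays the same and shrinks when it flips.

    import numpy as np

    def rprop_step(grad, prev_grad, step,
                   eta_plus=1.2, eta_minus=0.5, step_max=50.0, step_min=1e-6):
        same_sign = grad * prev_grad
        step = np.where(same_sign > 0,
                        np.minimum(step * eta_plus, step_max),
                        np.where(same_sign < 0,
                                 np.maximum(step * eta_minus, step_min),
                                 step))
        grad = np.where(same_sign < 0, 0.0, grad)   # skip the update right after a sign flip
        delta_w = -np.sign(grad) * step             # move against the gradient's sign
        return delta_w, grad, step                  # grad becomes prev_grad next call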

Multi-layer neural network won't predict negative values

Submitted by 十年热恋 on 2019-12-04 22:47:22
I have implemented a multilayer perceptron to predict the sin of input vectors. The vectors consist of four values chosen at random from -1, 0, 1 and a bias set to 1. The network should predict the sin of the sum of the vector's contents, e.g. Input = <0, 1, -1, 0, 1>, Output = sin(0 + 1 + (-1) + 0 + 1). The problem I am having is that the network never predicts a negative value, and many of the vectors' sin values are negative. It predicts all positive or zero outputs perfectly. I am presuming that there is a problem with updating the weights, which are updated after every epoch. Has anyone encountered this problem with …
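
One common cause of exactly this symptom (offered as a guess, not a diagnosis of this particular code) is an output unit with a logistic-sigmoid activation, whose range is (0, 1), so negative targets are simply unreachable. A tiny sketch of a forward pass whose output range covers sin's [-1, 1] by using tanh at the output (weight shapes are illustrative):

    import numpy as np

    def forward(x, W1, b1, W2, b2):
        h = np.tanh(W1 @ x + b1)        # hidden layer
        return np.tanh(W2 @ h + b2)     # tanh output can produce values in (-1, 1)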

How to convert deep learning gradient descent equation into python

Submitted by 大城市里の小女人 on 2019-12-04 21:31:38
I've been following an online tutorial on deep learning. It has a practical question on gradient descent and cost calculations where I have been struggling to reproduce the given answers once they were converted to Python code. I hope you can kindly help me get the correct answer. Please see the following link for the equations used: Click here to see the equations used for the calculations. Following is the function given to calculate the gradient descent, cost, etc. The values need to be found without using for loops, only matrix manipulation operations:

    import numpy as np

    def propagate(w, b, X, Y):
        """ …
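
Without the linked equations to hand, here is a hedged sketch of what a propagate() of this shape typically computes in such tutorials (vectorized logistic-regression forward cost and gradients, no Python loops); the exact cost and gradient formulas should be checked against the course's equations:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def propagate(w, b, X, Y):
        m = X.shape[1]                          # X is (n_features, m_examples)
        A = sigmoid(np.dot(w.T, X) + b)         # forward pass: activations, shape (1, m)
        cost = -(1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        dw = (1.0 / m) * np.dot(X, (A - Y).T)   # backward pass: gradient w.r.t. w
        db = (1.0 / m) * np.sum(A - Y)          # gradient w.r.t. b
        return {"dw": dw, "db": db}, np.squeeze(cost)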

Backpropagation Algorithm Implementation

Submitted by 喜你入骨 on 2019-12-04 16:08:34
Dear all, I am trying to implement a neural network which uses backpropagation. So far I have got to the stage where each neuron receives weighted inputs from all neurons in the previous layer, calculates the sigmoid function based on their sum and distributes it across the following layer. Finally, the entire network produces a result O. I then calculate the error as E = 1/2 (D - O)^2, where D is the desired value. At this point, with every neuron in the network knowing its individual output and the overall error of the net, how can I backpropagate this to adjust the weights? Cheers :)

I would highly …
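
A hedged sketch of the standard answer for one training example with sigmoid units and E = 1/2 (D - O)^2 (one hidden layer; shapes and the learning rate are illustrative): compute the output delta, push deltas backwards through the weights, then update each weight with the delta of the neuron it feeds times the output of the neuron it comes from.

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    def backprop_step(x, d, W1, W2, lr=0.1):
        # forward pass
        h = sigmoid(W1 @ x)                      # hidden activations
        o = sigmoid(W2 @ h)                      # network output O
        # backward pass
        delta_o = (o - d) * o * (1 - o)          # dE/dS at the output layer
        delta_h = (W2.T @ delta_o) * h * (1 - h) # deltas propagated to the hidden layer
        # gradient-descent weight updates
        W2 -= lr * np.outer(delta_o, h)
        W1 -= lr * np.outer(delta_h, x)
        return W1, W2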

Implementing a perceptron with backpropagation algorithm

Submitted by 眉间皱痕 on 2019-12-04 14:14:27
Question: I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer and 1 output in the second layer. I am using this for reference, but am having problems with convergence. First, I will note that I am using a sigmoid function for activation, so the derivative is (from what I understand) sigmoid(v) * (1 - sigmoid(v)), and that is used when calculating the delta value. So, basically I set …
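
For concreteness, a small sketch of the setup described here (illustrative only): the full 16-row parity dataset over 4 binary inputs, plus the sigmoid derivative used for the deltas.

    import itertools
    import numpy as np

    X = np.array(list(itertools.product([0, 1], repeat=4)), dtype=float)   # 16 x 4 inputs
    Y = X.sum(axis=1) % 2            # parity target: 1 for an odd number of ones

    def sigmoid(v):
        return 1.0 / (1.0 + np.exp(-v))

    def dsigmoid(v):
        s = sigmoid(v)
        return s * (1 - s)           # sigmoid(v) * (1 - sigmoid(v))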

Backpropagation for rectified linear unit activation with cross entropy error

Submitted by 旧城冷巷雨未停 on 2019-12-03 15:18:23
I'm trying to implement gradient calculation for neural networks using backpropagation. I cannot get it to work with cross-entropy error and rectified linear unit (ReLU) activation. I managed to get my implementation working for squared error with sigmoid, tanh and ReLU activation functions. Cross-entropy (CE) error with sigmoid activation gradient is computed correctly. However, when I change the activation to ReLU, it fails. (I'm skipping tanh for CE as it returns values in the (-1, 1) range.) Is it because of the behavior of the log function at values close to 0 (which is returned by ReLUs approx. 50 …
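
As an illustration of why the log in cross-entropy is the usual suspect (a sketch, not a diagnosis of this particular implementation): a ReLU output can be exactly 0, and log(0) is -inf, so the CE error and its gradient blow up. A common workaround is to clip the activations fed to the log, or to keep a sigmoid/softmax output layer and use ReLU only in the hidden layers.

    import numpy as np

    def cross_entropy(a, y, eps=1e-12):
        a = np.clip(a, eps, 1.0 - eps)    # keep log() finite when a hits 0 or 1
        return -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))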