backpropagation

Keras: combining two losses with adjustable weights

二次信任 submitted on 2019-11-30 09:48:37
So here is the detailed description. I have a Keras functional model with two layers with outputs x1 and x2:

    x1 = Dense(1, activation='relu')(prev_inp1)
    x2 = Dense(2, activation='relu')(prev_inp2)

I need to use these x1 and x2, merge/add them, and come up with a weighted loss function like the one in the attached image, then propagate the 'same loss' into both branches. Alpha is flexible to vary with iterations. It seems that propagating the "same loss" into both branches will not take effect unless alpha is dependent on both branches; if alpha is not variable depending on both branches, then part of the loss…
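A minimal sketch of one way to get an adjustable weight between two branch losses in Keras (this is not the poster's model; the input shapes, the MSE losses, and the alpha schedule below are placeholder assumptions): keep alpha in a backend variable, read it inside the loss functions, and update it from a callback so both branches see the same weighting at every iteration.

    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers, backend as K

    alpha = K.variable(0.5)  # adjustable weight shared by both losses

    prev_inp1 = keras.Input(shape=(8,))   # placeholder input shapes
    prev_inp2 = keras.Input(shape=(8,))
    x1 = layers.Dense(1, activation='relu')(prev_inp1)
    x2 = layers.Dense(2, activation='relu')(prev_inp2)
    model = keras.Model([prev_inp1, prev_inp2], [x1, x2])

    def loss_x1(y_true, y_pred):
        # branch 1 contributes alpha * MSE
        return alpha * K.mean(K.square(y_true - y_pred), axis=-1)

    def loss_x2(y_true, y_pred):
        # branch 2 contributes (1 - alpha) * MSE
        return (1.0 - alpha) * K.mean(K.square(y_true - y_pred), axis=-1)

    model.compile(optimizer='adam', loss=[loss_x1, loss_x2])

    class AlphaSchedule(keras.callbacks.Callback):
        """Change alpha between epochs; both losses read the updated value."""
        def on_epoch_end(self, epoch, logs=None):
            K.set_value(alpha, max(0.1, 0.5 - 0.05 * epoch))

    # model.fit([a1, a2], [t1, t2], callbacks=[AlphaSchedule()])

Because each loss term is attached to its own output, the weighted gradients flow back into the corresponding branch; alpha only rescales them.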

XOR neural network error stops decreasing during training

我是研究僧i submitted on 2019-11-30 07:09:05
I'm training an XOR neural network via back-propagation using stochastic gradient descent. The weights of the neural network are initialized to random values between -0.5 and 0.5. The neural network successfully trains itself around 80% of the time. However, sometimes it gets "stuck" while backpropagating. By "stuck", I mean that I start seeing a decreasing rate of error correction. For example, during a successful training run, the total error decreases rather quickly as the network learns:

    ...
    Total error for this training set: 0.0010008071327708653
    Total error for this training set: …
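For concreteness, a small sketch of the setup described above (not the poster's code; the 2-2-1 sigmoid network is an assumption): uniform initialization in [-0.5, 0.5] and the total-error figure that is being logged. A run that is "stuck" shows this figure plateauing well above zero.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

    # 2-2-1 network, weights drawn uniformly from [-0.5, 0.5]
    W1 = rng.uniform(-0.5, 0.5, size=(2, 2)); b1 = rng.uniform(-0.5, 0.5, size=2)
    W2 = rng.uniform(-0.5, 0.5, size=(2, 1)); b2 = rng.uniform(-0.5, 0.5, size=1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def total_error():
        h = sigmoid(X @ W1 + b1)
        y = sigmoid(h @ W2 + b2)
        return 0.5 * np.sum((T - y) ** 2)

    # When this value stops shrinking, a common remedy is to re-initialize
    # the weights and restart that run.
    print("Total error for this training set:", total_error())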

Why is a bias neuron necessary for a backpropagating neural network that recognizes the XOR operator?

不羁岁月 submitted on 2019-11-30 05:02:51
Question: I posted a question yesterday regarding issues I was having with my backpropagating neural network for the XOR operator. I did a little more work and realized that it may have to do with not having a bias neuron. My question is: what is the role of the bias neuron in general, and what is its role in a backpropagating neural network that recognizes the XOR operator? Is it possible to create one without a bias neuron?

Answer 1: It's possible to create a neural network without a bias neuron...
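A small illustration of what the bias buys you (not from the original thread; the sigmoid unit below is just an example): without a bias term a unit's pre-activation is w·x, so its output at the input (0, 0) is always sigmoid(0) = 0.5 and its decision boundary must pass through the origin. The bias lets backpropagation shift that boundary, which the XOR mapping needs.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.0, 0.0])          # the (0, 0) input of the XOR table
    w = np.array([1.7, -2.3])         # any weights at all

    # Without a bias, the output at (0, 0) is pinned to 0.5 forever:
    print(sigmoid(w @ x))             # 0.5, regardless of w

    # With a bias, training can move the unit's threshold:
    b = -3.0
    print(sigmoid(w @ x + b))         # ~0.047, the boundary has shifted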

When to use in-place layers in Caffe?

我们两清 submitted on 2019-11-30 00:23:15
Question: By setting the bottom and the top blob to be the same, we can tell Caffe to do "in-place" computation to reduce memory consumption. Currently I know I can safely use in-place "BatchNorm", "Scale" and "ReLU" layers (please let me know if I'm wrong), while other layers seem to have issues with it (this issue seems to be an example). When should in-place layers be used in Caffe, and how does it work with back-propagation?

Answer 1: As you noted, in-place layers don't usually work "out of the box"...
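The back-propagation issue is that an in-place layer overwrites its input (bottom) blob with its output (top), so any layer whose backward pass needs the original input data can no longer compute its gradient correctly. ReLU is safe because its derivative can be read off the output alone. A rough numpy sketch of that distinction (this is an illustration of the idea, not Caffe code):

    import numpy as np

    buf = np.array([-1.5, 0.3, 2.0])      # shared bottom/top blob (in-place)
    grad_top = np.array([1.0, 1.0, 1.0])  # gradient arriving from above

    # In-place ReLU: the forward pass overwrites the buffer ...
    buf = np.maximum(buf, 0.0)
    # ... and the backward pass only needs to know where the *output* is
    # positive, so the gradient is still correct even though the input is gone.
    grad_bottom = grad_top * (buf > 0)

    # Contrast: a layer like y = x**2 needs the original input x in its
    # backward pass (dy/dx = 2x). Run in place, x has been replaced by y,
    # so the gradient sent downward would be computed from the wrong values.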

Implementing back propagation using numpy and Python for the Cleveland dataset

霸气de小男生 submitted on 2019-11-29 22:44:13
Question: I wanted to predict heart disease using the backpropagation algorithm for neural networks. For this I used the UCI heart disease data set linked here: processed cleveland. To do this, I used the code found on the following blog, Build a flexible Neural Network with Backpropagation in Python, and changed it a little bit to fit my own dataset. My code is as follows:

    import numpy as np
    import csv

    # Read the processed Cleveland data into a numpy array (of strings)
    reader = csv.reader(open("cleveland_data.csv"), delimiter=",")
    x = list(reader)
    result = np.array(x)
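One step the excerpt stops short of is turning that array of strings into numeric features and labels. A hedged sketch of that conversion (not the poster's code; it assumes the standard processed Cleveland layout of 13 clinical features plus the diagnosis in the last column, and the handling of the file's '?' missing-value markers is just one option):

    import numpy as np
    import csv

    with open("cleveland_data.csv") as f:
        rows = [r for r in csv.reader(f, delimiter=",") if r]

    # The processed Cleveland file marks missing values with '?';
    # replace them with NaN so the array can be cast to float.
    data = np.array([[np.nan if v == "?" else float(v) for v in row]
                     for row in rows])

    # Drop rows with missing values (imputation is another option).
    data = data[~np.isnan(data).any(axis=1)]

    X = data[:, :13]                     # 13 clinical features
    y = (data[:, 13] > 0).astype(float)  # 0 = no disease, 1-4 = disease present

    # Scale features to [0, 1] so a sigmoid network trains sensibly.
    X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))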

Neural network backpropagation with RELU

…衆ロ難τιáo~ submitted on 2019-11-29 21:25:50
I am trying to implement a neural network with ReLU:

    input layer -> 1 hidden layer -> relu -> output layer -> softmax layer

Above is the architecture of my neural network, and I am confused about backpropagation through this ReLU. For the derivative of ReLU: if x <= 0 the output is 0, and if x > 0 the output is 1. So when you calculate the gradient, does that mean I kill gradient descent if x <= 0? Can someone explain the backpropagation of my neural network architecture 'step by step'?

"if x <= 0, output is 0. if x > 0, output is 1" — the ReLU function is defined as f(x) = max(0, x), so for x > 0 the output is x, so for…
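The gradient is not "killed" globally: on a given example, only the hidden units whose pre-activation was <= 0 block the gradient, while the active units pass it through unchanged. A small numpy illustration of the backward step through the ReLU (not the poster's network; the numbers are made up):

    import numpy as np

    z = np.array([-2.0, 0.5, 3.0])        # hidden pre-activations
    a = np.maximum(z, 0.0)                # ReLU forward: [0.0, 0.5, 3.0]

    grad_a = np.array([0.4, -1.2, 0.7])   # gradient of the loss w.r.t. a
    grad_z = grad_a * (z > 0)             # backward: [0.0, -1.2, 0.7]

    # Only the first unit (z <= 0) contributes zero gradient on this example;
    # the other units, and other training examples, still update the weights.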

Understanding Neural Network Backpropagation

浪子不回头ぞ submitted on 2019-11-29 19:51:50
Update: a better formulation of the issue. I'm trying to understand the backpropagation algorithm using an XOR neural network as an example. For this case there are 2 input neurons + 1 bias, 2 neurons in the hidden layer + 1 bias, and 1 output neuron.

     A    B    A XOR B
     1    1      -1
     1   -1       1
    -1    1       1
    -1   -1      -1

(network diagram source: wikimedia.org)

I'm using stochastic backpropagation. After reading a bit more I have found out that the error of the output unit is propagated to the hidden layers... Initially this was confusing, because when you get to the input layer of the neural network, each neuron gets an error…
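For the 2-2-1 network described, the step that usually causes confusion is how the single output error turns into one error per hidden neuron: each hidden unit receives the output delta scaled by its outgoing weight and by its own activation derivative. A sketch of one stochastic update (not the poster's code; tanh activations and a learning rate of 0.1 are assumptions that match the -1/1 targets):

    import numpy as np

    rng = np.random.default_rng(1)
    W1 = rng.uniform(-0.5, 0.5, (2, 2)); b1 = rng.uniform(-0.5, 0.5, 2)
    W2 = rng.uniform(-0.5, 0.5, (2, 1)); b2 = rng.uniform(-0.5, 0.5, 1)
    lr = 0.1

    x = np.array([1.0, -1.0]); t = np.array([1.0])   # one XOR pattern

    # Forward pass (tanh units).
    h = np.tanh(x @ W1 + b1)
    y = np.tanh(h @ W2 + b2)

    # Backward pass: output delta, then one delta per hidden neuron.
    delta_out = (y - t) * (1.0 - y ** 2)             # dE/dnet_out
    delta_hid = (delta_out @ W2.T) * (1.0 - h ** 2)  # dE/dnet_hidden

    # Gradient-descent update (biases behave like weights from a +1 input).
    W2 -= lr * np.outer(h, delta_out); b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hid); b1 -= lr * delta_hid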

How to build a multiple-input graph with TensorFlow?

我的未来我决定 submitted on 2019-11-29 18:01:23
Question: Is it possible to define a TensorFlow graph with more than one input? For instance, I want to give the graph two images and one text; each one is processed by a bunch of layers ending in a fully connected (fc) layer. Then there is a node that computes a loss function that takes the three representations into account. The aim is to let the three nets backpropagate from the joint-representation loss. Is it possible? Any example/tutorial about it? Thanks in advance!

Answer 1: This is completely…
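A rough sketch of that topology with the Keras functional API on top of TensorFlow (the layer sizes, vocabulary size, and loss below are placeholder assumptions, not from the question): three inputs, three per-branch stacks ending in a fully connected layer, and one joint loss; a single optimizer step backpropagates through all three branches.

    import tensorflow as tf
    from tensorflow.keras import layers, Model

    img_a = tf.keras.Input(shape=(64, 64, 3), name="image_a")
    img_b = tf.keras.Input(shape=(64, 64, 3), name="image_b")
    text  = tf.keras.Input(shape=(50,), dtype="int32", name="text")

    def image_branch(x):
        x = layers.Conv2D(16, 3, activation="relu")(x)
        x = layers.GlobalAveragePooling2D()(x)
        return layers.Dense(64, activation="relu")(x)   # fc at the end

    def text_branch(x):
        x = layers.Embedding(input_dim=10000, output_dim=32)(x)
        x = layers.GlobalAveragePooling1D()(x)
        return layers.Dense(64, activation="relu")(x)   # fc at the end

    joint = layers.concatenate([image_branch(img_a),
                                image_branch(img_b),
                                text_branch(text)])
    out = layers.Dense(1, activation="sigmoid")(joint)  # drives the joint loss

    model = Model(inputs=[img_a, img_b, text], outputs=out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    # model.fit([images_a, images_b, texts], labels) updates all three branches.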

Resilient backpropagation neural network - question about gradient

隐身守侯 submitted on 2019-11-29 01:15:50
Question: First, I want to say that I'm really new to neural networks and I don't understand them very well ;) I've made my first C# implementation of a backpropagation neural network. I've tested it using XOR and it looks like it works. Now I would like to change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop). The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on…"
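The quoted definition translates into a per-weight update rule: compare the sign of the current batch gradient with the previous one, grow or shrink that weight's own step size accordingly, and move by the step size in the direction opposite to the gradient's sign. A compact numpy sketch of the plain Rprop- variant (the constants are the usual published defaults; this is an illustration, not the poster's C# code):

    import numpy as np

    def rprop_update(w, grad, prev_grad, step,
                     eta_plus=1.2, eta_minus=0.5,
                     step_min=1e-6, step_max=50.0):
        """One Rprop- update for a weight array w (all arrays share a shape)."""
        same_sign = grad * prev_grad
        # Same sign as last time: the step was good, grow it (capped).
        step = np.where(same_sign > 0, np.minimum(step * eta_plus, step_max), step)
        # Sign flipped: we jumped over a minimum, shrink the step (floored).
        step = np.where(same_sign < 0, np.maximum(step * eta_minus, step_min), step)
        # Move by the step size, using only the sign of the gradient.
        w = w - np.sign(grad) * step
        return w, grad.copy(), step

    # Usage: keep prev_grad and step per weight, and call once per full pass
    # (Rprop is defined for batch gradients, not per-pattern updates).
    w = np.zeros(4); prev_grad = np.zeros(4); step = np.full(4, 0.1)
    grad = np.array([0.3, -0.7, 0.0, 1.2])   # batch gradient (example values)
    w, prev_grad, step = rprop_update(w, grad, prev_grad, step)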

How does the back-propagation algorithm deal with non-differentiable activation functions?

∥☆過路亽.° submitted on 2019-11-28 18:46:10
While digging through the topic of neural networks and how to train them efficiently, I came across the method of using very simple activation functions, such as the rectified linear unit (ReLU), instead of the classic smooth sigmoids. The ReLU function is not differentiable at the origin, so according to my understanding the backpropagation algorithm (BPA) is not suitable for training a neural network with ReLUs, since the chain rule of multivariable calculus refers to smooth functions only. However, none of the papers about using ReLUs that I have read address this issue. ReLUs seem to be very…
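In practice, implementations sidestep the single non-differentiable point by picking one value from the subdifferential at 0 (any value in [0, 1] is a valid subgradient there; 0 is the usual choice), and the pre-activation lands exactly on 0 only rarely. A one-line numpy convention of that kind (shown as an assumption about typical implementations, not a quote from any paper):

    import numpy as np

    def relu_grad(x):
        # Derivative of max(0, x); at x == 0 this picks the subgradient 0.
        return (x > 0).astype(x.dtype)

    x = np.array([-1.0, 0.0, 2.5])
    print(relu_grad(x))   # [0. 0. 1.]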