backpropagation

How Many Epochs Should a Neural Net Need to Learn to Square? (Testing Results Included)

前提是你 submitted on 2020-01-13 14:31:23
Question: Okay, let me preface this by saying that I am well aware that this depends on MANY factors; I'm looking for some general guidelines from people with experience. My goal is not to make a neural net that can compute squares of numbers for me, but I thought it would be a good experiment to see if I implemented the backpropagation algorithm correctly. Does this seem like a good idea? Anyway, I am worried that I have not implemented the learning algorithm (fully) correctly. My Testing (Results):
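For reference, a minimal sketch of what such an experiment can look like: a NumPy net with one tanh hidden layer trained on y = x² with manual backpropagation. The layer sizes, learning rate, and epoch count here are illustrative assumptions, not the asker's implementation.

```python
# Minimal sketch (not the asker's code): a 1-16-1 network with tanh hidden
# units learning y = x^2 on inputs in [-1, 1]. Epoch count is illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (256, 1))
Y = X ** 2

W1, b1 = rng.normal(0, 0.5, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 1)), np.zeros(1)
lr = 0.1

for epoch in range(2000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    err = out - Y                      # dLoss/dout for 0.5 * MSE
    # backward pass
    dW2 = h.T @ err / len(X)
    db2 = err.mean(0)
    dh = err @ W2.T * (1 - h ** 2)     # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dh / len(X)
    db1 = dh.mean(0)
    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print("final MSE:", float((err ** 2).mean()))
```

With settings in this ballpark, a toy net like this usually fits the squaring function within a few thousand epochs; much slower convergence can hint at a gradient bug.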

PyTorch: RuntimeError: Function MulBackward0 returned an invalid gradient at index 0 - expected type torch.cuda.FloatTensor but got torch.FloatTensor

喜欢而已 submitted on 2020-01-04 05:42:13
Question: I don't understand what this error is telling me. The same problem was addressed in a different post, but there was no useful solution there. Traceback (most recent call last): File "train.py", line 252, in <module> main() File "train.py", line 231, in main train(net, training_dataset, targets, device, criterion, optimizer, epoch, args.epochs) File "train.py", line 103, in train loss.backward() File "/home/hb119056/.local/lib/python3.6/site-packages/torch/tensor.py", line 107, in
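The truncated traceback does not show the offending line, but this error typically means a CPU tensor was multiplied into a computation graph that otherwise lives on the GPU. A hedged sketch of the usual fix, with placeholder names:

```python
# Hypothetical sketch of the usual fix: make sure every tensor that enters
# the graph lives on the same device as the model before loss.backward().
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = torch.nn.Linear(10, 1).to(device)

x = torch.randn(4, 10)
scale = torch.tensor(2.0)              # a CPU tensor multiplied into a CUDA
                                       # graph triggers the MulBackward0 mismatch
x, scale = x.to(device), scale.to(device)

loss = (net(x) * scale).mean()
loss.backward()                        # gradients now agree on device
```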

Backpropagation Neural Network Approach - Design

喜你入骨 submitted on 2020-01-04 05:36:15
Question: I am trying to make a digit recognition program. I feed in a black/white image of a digit and my output layer fires the corresponding digit (one neuron fires, out of the 0 -> 9 neurons in the output layer). I finished implementing a two-dimensional backpropagation neural network. My topology sizes are [5][3] -> [3][3] -> [1][10], so it's one 2-D input layer, one 2-D hidden layer and one 1-D output layer. However, I am getting weird and wrong results (average error and output values).
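For comparison, a sketch of the flat equivalent of that topology, under the assumption that the 2-D layers are just reshaped fully connected layers (activation choices and sizes here are illustrative, not the asker's code):

```python
# Sketch: a [5][3] -> [3][3] -> [10] 2-D topology behaves like a flat
# 15 -> 9 -> 10 fully connected net once the grids are flattened; targets
# for digit recognition are one-hot vectors, prediction is the argmax.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((5, 3))              # 5x3 black/white input grid
x = image.reshape(-1)                   # flatten to 15 inputs

W1 = rng.normal(0, 0.5, (15, 9))        # 15 -> 9 hidden ("3x3" layer flattened)
W2 = rng.normal(0, 0.5, (9, 10))        # 9 -> 10 output, one neuron per digit

h = np.tanh(x @ W1)
out = 1 / (1 + np.exp(-(h @ W2)))       # sigmoid outputs in (0, 1)

target = np.zeros(10)
target[7] = 1.0                         # one-hot target: only neuron 7 should fire
print("predicted digit:", int(np.argmax(out)))
```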

Backpropagation for rectified linear unit activation with cross entropy error

一个人想着一个人 submitted on 2020-01-01 05:03:32
Question: I'm trying to implement gradient calculation for neural networks using backpropagation. I cannot get it to work with cross-entropy error and the rectified linear unit (ReLU) as activation. I managed to get my implementation working for squared error with sigmoid, tanh and ReLU activation functions. Cross-entropy (CE) error with sigmoid activation gradient is computed correctly. However, when I change the activation to ReLU, it fails. (I'm skipping tanh for CE as it returns values in the (-1,1) range.) Is
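For reference, a small NumPy sketch of the gradients in question, assuming a softmax output layer with cross-entropy and one ReLU hidden layer (an assumption about the setup, not the asker's implementation):

```python
# Sketch of the gradients for cross-entropy + softmax with a ReLU hidden layer.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.array([0.5, -0.2, 0.8])
y = np.array([0.0, 1.0, 0.0])            # one-hot target
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 3))

z1 = x @ W1
a1 = np.maximum(0.0, z1)                 # ReLU
p = softmax(a1 @ W2)

# Cross-entropy with softmax: the output delta is simply p - y.
d_out = p - y
dW2 = np.outer(a1, d_out)
# ReLU derivative is 1 where z1 > 0, else 0 (no sigmoid-style a*(1-a) term).
d_hidden = (d_out @ W2.T) * (z1 > 0)
dW1 = np.outer(x, d_hidden)
```

A common failure mode when switching from sigmoid to ReLU is reusing the sigmoid derivative a*(1-a) in the hidden layer instead of the ReLU step function shown above.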

How does keras handle multiple losses?

馋奶兔 submitted on 2019-12-31 08:45:08
Question: So my question is, if I have something like: model = Model(inputs = input, outputs = [y1,y2]) l1 = 0.5 l2 = 0.3 model.compile(loss = [loss1,loss2], loss_weights = [l1,l2], ...) What does Keras do with the losses to obtain the final loss? Is it something like: final_loss = l1*loss1 + l2*loss2 Also, what does it mean during training? Is loss2 only used to update the weights of the layers where y2 comes from? Or is it used for all the model's layers? I'm pretty confused. Answer 1: From model
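A hedged illustration of the setup being asked about (layer names and sizes are made up): Keras minimizes the weighted sum of the losses, so shared layers receive gradient from both terms, while each output head is only affected by its own loss.

```python
# Sketch: the total loss is l1*loss1 + l2*loss2, and its gradient flows into
# every layer on the path to either output -- shared layers get both terms.
import tensorflow as tf

inp = tf.keras.Input(shape=(16,))
shared = tf.keras.layers.Dense(32, activation="relu")(inp)   # updated by both losses
y1 = tf.keras.layers.Dense(1, name="y1")(shared)             # head updated by loss1 only
y2 = tf.keras.layers.Dense(1, name="y2")(shared)             # head updated by loss2 only

model = tf.keras.Model(inputs=inp, outputs=[y1, y2])
model.compile(optimizer="adam", loss=["mse", "mae"], loss_weights=[0.5, 0.3])
# Reported total loss per batch = 0.5 * mse(y1) + 0.3 * mae(y2)
```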

can't find the inplace operation: one of the variables needed for gradient computation has been modified by an inplace operation

夙愿已清 submitted on 2019-12-31 05:41:29
Question: I am trying to compute a loss on the Jacobian of the network (i.e. to perform double backprop), and I get the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation. I can't find the in-place operation in my code, so I don't know which line to fix. The error occurs in the last line: loss3.backward() inputs_reg = Variable(data, requires_grad=True) output_reg = self.model.forward(inputs_reg) num_classes = output.size()[1]
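Without the full listing it is impossible to point at the exact line, but a typical double-backprop setup that avoids in-place operations looks roughly like this (placeholder model and data, not the asker's network):

```python
# Hedged sketch of a double-backprop gradient penalty without in-place ops.
import torch

model = torch.nn.Sequential(torch.nn.Linear(10, 10),
                            torch.nn.ReLU(),
                            torch.nn.Linear(10, 3))
data = torch.randn(4, 10)
inputs_reg = data.clone().requires_grad_(True)
output_reg = model(inputs_reg)

grad_inputs, = torch.autograd.grad(
    outputs=output_reg.sum(), inputs=inputs_reg,
    create_graph=True)                     # keep the graph for the second backward

loss3 = grad_inputs.pow(2).sum()           # out-of-place .pow(2), not .pow_(2)
loss3.backward()                           # works as long as no tensor in the
                                           # graph was modified in place (no +=,
                                           # no relu(inplace=True), no .pow_())
```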

The Mathematical Principles of the Backpropagation Algorithm

半城伤御伤魂 submitted on 2019-12-28 12:23:57
Preliminaries: This is not an introductory article; the reader should at least have a conceptual understanding of artificial neural networks and be familiar with partial derivatives.

Consider a simple artificial neural network with an input layer, a hidden layer and an output layer. The input layer passes the raw data x on to the next layer as its output, a^(1). The hidden layer takes a linear combination of the elements of a^(1) as its input, z^(2), then applies a transformation (a function g(z)) to produce its output to the next layer, a^(2). Continuing in this way, the output of the last layer, a^(3) or h_Θ(x), is the result computed by the network. Here Θ^(1) and Θ^(2) are the linear-combination parameters used when layers 1 and 2 pass data forward; collectively they are written Θ.

During training, data that has already been labeled with the correct answer (call it y) is fed into the network. At first the network's output is usually far from the answer; an optimization algorithm (for example gradient descent) then repeatedly adjusts Θ so that the network's output moves closer and closer to the correct answer, until the required accuracy is reached.

Here we define an expression called the cost function: J(Θ) = [the gap between the computed result and the correct answer]. The exact form of the cost function does not matter; what matters is that it measures the computation error and that it is a function of Θ. Our goal is to make this error as small as possible.
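An illustrative rendering of this forward pass in code (the layer sizes and the choice of g as a sigmoid are assumptions, not part of the article):

```python
# Forward pass for the notation above: a^(1) -> z^(2) -> a^(2) -> a^(3) = h_Theta(x),
# plus one possible cost J(Theta).
import numpy as np

def g(z):                                       # the element-wise transformation g(z)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.2, 0.7, 0.5])                   # raw input
Theta1 = np.random.randn(4, 3)                  # layer-1 -> layer-2 parameters
Theta2 = np.random.randn(2, 4)                  # layer-2 -> layer-3 parameters

a1 = x                                          # a^(1): input layer output
z2 = Theta1 @ a1                                # z^(2): linear combination of a^(1)
a2 = g(z2)                                      # a^(2): hidden layer output
z3 = Theta2 @ a2
a3 = g(z3)                                      # a^(3) = h_Theta(x)

y = np.array([1.0, 0.0])                        # labeled "correct answer"
J = 0.5 * np.sum((a3 - y) ** 2)                 # one possible cost J(Theta)
```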

Training feedforward neural network for OCR [closed]

大憨熊 submitted on 2019-12-28 11:46:16
Question (closed as needing more focus): Currently I'm learning about neural networks and I'm trying to create an application that can be trained to recognize handwritten characters. For this problem I use a feed-forward neural network, and it seems to work when I train it to recognize 1, 2 or 3 different characters. But

Why is it that in PyTorch, when I make a COPY of a network's weights, it is automatically updated after back-propagation?

假如想象 submitted on 2019-12-25 03:45:16
Question: I wrote the following code as a test because in my original network I use a ModuleDict, and depending on what index I feed it, it slices and trains only parts of that network. I wanted to make sure that only the sliced layers would update their weights, so I wrote some test code to double check. Well, I am getting some weird results. Say my model has 2 layers: layer1 is an FC and layer2 is a Conv2d. If I slice the network and ONLY use layer2, I would expect layer1's weights to be unchanged because
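A sketch of the behaviour being described (tiny stand-in model rather than the asker's ModuleDict): plain assignment copies a reference to the parameter tensor, so the "copy" follows the optimizer's updates, whereas clone().detach() takes an independent snapshot.

```python
# Sketch: assignment is a reference, clone().detach() is a real copy.
import torch

layer1 = torch.nn.Linear(3, 3)
opt = torch.optim.SGD(layer1.parameters(), lr=0.1)

ref_copy = layer1.weight                      # same tensor object, not a copy
snapshot = layer1.weight.clone().detach()     # independent copy of the values

loss = layer1(torch.randn(2, 3)).sum()
loss.backward()
opt.step()

print(torch.equal(ref_copy, layer1.weight))   # True  -> the "copy" changed too
print(torch.equal(snapshot, layer1.weight))   # False -> the snapshot did not
```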

Backpropagation and batch training

那年仲夏 submitted on 2019-12-24 06:38:04
Question: Backpropagation calculates dW (the weight delta) per weight per pattern, so it's straightforward how to modify the weights when doing stochastic training. How do I use it for batch training, though? Do I simply accumulate dW over the entire training set and then apply the modification, or is there more to it? Answer 1: Yes, just accumulate dW over the entire training set. At least that is how I coded it back in grad school... Answer 2: You can do a lot with the different gradients from the different samples. That
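A toy sketch of the accepted approach (single linear layer and illustrative names, not the asker's network): accumulate dW over all patterns, then apply one averaged update per epoch.

```python
# Batch training sketch: per-pattern gradients are summed, then applied once.
import numpy as np

rng = np.random.default_rng(0)
X, Y = rng.random((100, 4)), rng.random((100, 2))
W = rng.normal(size=(4, 2))
lr = 0.1

for epoch in range(10):
    dW_total = np.zeros_like(W)
    for x, y in zip(X, Y):                 # per-pattern backprop, as usual
        err = x @ W - y
        dW_total += np.outer(x, err)       # accumulate instead of applying
    W -= lr * dW_total / len(X)            # one batch update per epoch
```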