backpropagation

How to apply Guided BackProp in TensorFlow 2.0?

Submitted by 夏天 on 2020-08-04 09:15:08
Question: I am starting with TensorFlow 2.0 and trying to implement Guided BackProp to display a saliency map. I started by computing the loss between y_pred and y_true for an image, then finding the gradients of all layers with respect to this loss:

    with tf.GradientTape() as tape:
        logits = model(tf.cast(image_batch_val, dtype=tf.float32))
        print('`logits` has type {0}'.format(type(logits)))
        xentropy = tf.nn.softmax_cross_entropy_with_logits(
            labels=tf.cast(tf.one_hot(1 - label_batch_val, depth=2), dtype=tf.int32),
            logits…
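Guided BackProp changes only the ReLU backward rule: a gradient is passed through a unit only when both the incoming gradient and the forward activation are positive. Below is a minimal TF2 sketch of that idea using tf.custom_gradient; the apply_guided_relu helper and the assumption that the model's hidden layers use tf.keras.activations.relu are illustrative, not from the original question.

    import tensorflow as tf

    @tf.custom_gradient
    def guided_relu(x):
        # Forward pass is an ordinary ReLU.
        def grad(dy):
            # Pass a gradient only where BOTH the incoming gradient
            # and the forward activation are positive.
            return dy * tf.cast(dy > 0, dy.dtype) * tf.cast(x > 0, dy.dtype)
        return tf.nn.relu(x), grad

    def apply_guided_relu(model):
        # Hypothetical helper: swap plain ReLU activations for guided_relu.
        for layer in model.layers:
            if getattr(layer, 'activation', None) is tf.keras.activations.relu:
                layer.activation = guided_relu
        return model

    def guided_saliency(model, image_batch):
        image = tf.cast(image_batch, tf.float32)
        with tf.GradientTape() as tape:
            tape.watch(image)
            logits = model(image)
            top = tf.reduce_max(logits, axis=-1)
        # Gradient of the top class score with respect to the input pixels.
        return tape.gradient(top, image)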

Forward vs reverse mode differentiation - PyTorch

Submitted by 一曲冷凌霜 on 2020-06-16 04:09:31
Question: In the first example of Learning PyTorch with Examples, the author demonstrates how to create a neural network with numpy. Their code is pasted below for convenience:

    # from: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
    # -*- coding: utf-8 -*-
    import numpy as np

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random input and output data
    x = np.random.randn(N, D_in)
    y = np…
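For context, that tutorial's network performs reverse-mode differentiation by hand: one forward pass, then the chain rule applied backwards from the loss. A condensed sketch of the same pattern (rewritten in the spirit of the linked tutorial, not a verbatim copy):

    import numpy as np

    N, D_in, H, D_out = 64, 1000, 100, 10
    x, y = np.random.randn(N, D_in), np.random.randn(N, D_out)
    w1, w2 = np.random.randn(D_in, H), np.random.randn(H, D_out)

    lr = 1e-6
    for t in range(500):
        # Forward pass: compute predicted y.
        h = x.dot(w1)
        h_relu = np.maximum(h, 0)
        y_pred = h_relu.dot(w2)
        loss = np.square(y_pred - y).sum()

        # Backward pass: manual reverse-mode differentiation, applying
        # the chain rule from the loss back towards the inputs.
        grad_y_pred = 2.0 * (y_pred - y)
        grad_w2 = h_relu.T.dot(grad_y_pred)
        grad_h_relu = grad_y_pred.dot(w2.T)
        grad_h = grad_h_relu * (h > 0)
        grad_w1 = x.T.dot(grad_h)

        # Gradient descent step.
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2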

What is the difference between backpropagation and reverse-mode autodiff?

Submitted by 心已入冬 on 2020-02-21 10:26:13
Question: Going through this book, I am familiar with the following: for each training instance, the backpropagation algorithm first makes a prediction (forward pass), measures the error, then goes through each layer in reverse to measure the error contribution from each connection (reverse pass), and finally slightly tweaks the connection weights to reduce the error. However, I am not sure how this differs from the reverse-mode autodiff implementation in TensorFlow. As far as I know, reverse-mode…
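For reference: reverse-mode autodiff is the general mechanism (one forward pass records the computation, one reverse pass yields all partial derivatives at once), and backpropagation is that mechanism applied to a network's loss, followed by a weight update. A minimal TF2 illustration (the function f is an arbitrary example, not from the book):

    import tensorflow as tf

    x = tf.Variable(3.0)
    y = tf.Variable(2.0)

    # Reverse-mode autodiff: a single reverse pass gives both df/dx
    # and df/dy, no matter how many inputs f has.
    with tf.GradientTape() as tape:
        f = x * x * y + y + 2.0

    dx, dy = tape.gradient(f, [x, y])
    print(dx.numpy(), dy.numpy())  # df/dx = 2xy = 12.0, df/dy = x^2 + 1 = 10.0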

Using stop_gradient with AdamOptimizer in TensorFlow

Submitted by 折月煮酒 on 2020-02-21 05:12:21
Question: I am trying to implement a training/fine-tuning framework in which, in each backpropagation iteration, a certain set of parameters stays fixed. I want to be able to change the set of updating or fixed parameters from iteration to iteration. The TensorFlow method tf.stop_gradient, which apparently forces the gradients of some parameters to stay zero, is very useful for this purpose, and it works perfectly fine with different optimizers if the set of updating or fixed parameters does not change from iteration to…
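One common way to vary which parameters are updated per iteration, without rebuilding anything, is to pass a different variable list to the gradient computation each step. A TF2-style sketch of that idea (variable names illustrative; in TF1 the analogous knob is the var_list argument of Optimizer.minimize):

    import tensorflow as tf

    opt = tf.keras.optimizers.Adam(1e-3)
    w_a = tf.Variable(1.0)
    w_b = tf.Variable(2.0)

    def loss_fn():
        return (w_a * w_b - 6.0) ** 2

    def train_step(trainable_now):
        # `trainable_now` can be a different subset on every call;
        # variables left out receive no update this iteration.
        with tf.GradientTape() as tape:
            loss = loss_fn()
        grads = tape.gradient(loss, trainable_now)
        opt.apply_gradients(zip(grads, trainable_now))

    train_step([w_a])        # update only w_a this iteration
    train_step([w_a, w_b])   # update both on the next one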

Neural Network and Temporal Difference Learning

Submitted by 限于喜欢 on 2020-01-24 05:25:07
Question: I have read a few papers and lectures on temporal difference learning (some as they pertain to neural nets, such as the Sutton tutorial on TD-Gammon), but I am having a difficult time understanding the equations, which leads me to my questions.

- Where does the prediction value V_t come from? And subsequently, how do we get V_(t+1)?
- What exactly is getting backpropagated when TD is used with a neural net? That is, where does the error that gets backpropagated come from when using TD?

Answer 1: …
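To the first question: V_t is simply the network's own output V(s_t; θ), and V_(t+1) is the same network evaluated on the next state. To the second: the quantity that drives backpropagation is the TD error δ_t = r_(t+1) + γ·V_(t+1) − V_t, with the bootstrapped target treated as a constant. A minimal TF2 sketch of TD(0) value learning (network shape and hyperparameters are illustrative):

    import tensorflow as tf

    gamma = 0.99
    value_net = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(1),
    ])
    opt = tf.keras.optimizers.Adam(1e-3)

    def td0_step(s, r, s_next):
        # Bootstrapped target r + gamma * V(s'); computed outside the
        # tape, so no gradient flows through it.
        target = r + gamma * value_net(s_next)
        with tf.GradientTape() as tape:
            v = value_net(s)
            # The TD error (target - v) is what gets backpropagated,
            # here via a squared loss that drives V(s) towards the target.
            loss = tf.reduce_mean(tf.square(target - v))
        grads = tape.gradient(loss, value_net.trainable_variables)
        opt.apply_gradients(zip(grads, value_net.trainable_variables))
        return loss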

How Many Epochs Should a Neural Net Need to Learn to Square? (Testing Results Included)

Submitted by 强颜欢笑 on 2020-01-13 14:31:38
Question: Okay, let me preface this by saying that I am well aware this depends on MANY factors; I'm looking for some general guidelines from people with experience. My goal is not to make a neural net that can compute squares of numbers for me, but I thought it would be a good experiment to see whether I implemented the backpropagation algorithm correctly. Does this seem like a good idea? Anyways, I am worried that I have not implemented the learning algorithm (fully) correctly. My Testing (Results): …
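As a point of comparison for such a sanity check: a one-hidden-layer network with hand-written backprop can usually fit y = x² on a small interval to low error within a few thousand epochs. A numpy sketch of such a check (all hyperparameters are illustrative, not from the original post):

    import numpy as np

    rng = np.random.default_rng(0)
    # Inputs scaled to [0, 1] so targets stay in a range the net can emit.
    x = np.linspace(0, 1, 21).reshape(-1, 1)
    y = x ** 2

    H, lr = 8, 0.5
    w1, b1 = rng.normal(size=(1, H)), np.zeros(H)
    w2, b2 = rng.normal(size=(H, 1)), np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for epoch in range(5000):
        # Forward pass: sigmoid hidden layer, linear output for regression.
        h = sigmoid(x @ w1 + b1)
        y_pred = h @ w2 + b2
        err = y_pred - y

        # Backward pass: chain rule by hand (loss is 0.5 * mean squared error).
        grad_w2 = h.T @ err / len(x)
        grad_b2 = err.mean(axis=0)
        grad_h = err @ w2.T * h * (1 - h)
        grad_w1 = x.T @ grad_h / len(x)
        grad_b1 = grad_h.mean(axis=0)

        for p, g in ((w1, grad_w1), (b1, grad_b1), (w2, grad_w2), (b2, grad_b2)):
            p -= lr * g

    print(float(np.mean(err ** 2)))  # should end up small, roughly 1e-4 or less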