neural-network

Lack of Sparse Solution with L1 Regularization in PyTorch

落花浮王杯 submitted on 2021-01-27 12:52:48
Question: I am trying to implement L1 regularization on the first layer of a simple neural network (1 hidden layer). I looked into some other posts on Stack Overflow that apply L1 regularization with PyTorch to figure out how it should be done (references: Adding L1/L2 regularization in PyTorch?, In Pytorch, how to add L1 regularizer to activations?). No matter how high I increase lambda (the L1 regularization strength parameter), I never get true zeros in the first weight matrix. Why would this be?
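A minimal sketch of the usual pattern being described, assuming a toy two-layer network and an MSE loss (the layer sizes, `lam`, and `train_step` are illustrative, not the asker's code):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
lam = 0.1  # L1 strength ("lambda" in the question)

def train_step(x, y):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    # Add the L1 penalty on the first layer's weights only
    loss = loss + lam * model[0].weight.abs().sum()
    loss.backward()
    optimizer.step()
    return loss.item()
```

As to the "why": plain (sub)gradient descent on an L1 penalty rarely lands on exact zeros, because each step overshoots zero rather than sticking to it, so weights oscillate near zero instead of becoming exactly zero. Exact sparsity typically needs a proximal update (soft-thresholding) or explicit clipping of small weights after training.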

Gradient checking in backpropagation

懵懂的女人 submitted on 2021-01-27 12:51:26
Question: I'm trying to implement gradient checking for a simple feedforward neural network with a 2-unit input layer, a 2-unit hidden layer, and a 1-unit output layer. What I do is the following: take each weight w of the network weights between all layers and perform forward propagation using w + EPSILON and then w - EPSILON. Compute the numerical gradient using the results of the two forward propagations. What I don't understand is how exactly to perform the backpropagation. Normally, I compare the
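A sketch of the central-difference check described above, assuming a helper `loss_fn` that runs a full forward pass over a flat weight vector (both names are illustrative):

```python
import numpy as np

EPSILON = 1e-4  # small perturbation for the finite-difference estimate

def numerical_gradient(loss_fn, weights, i, eps=EPSILON):
    # Central difference: (L(w + eps) - L(w - eps)) / (2 * eps) for weight i
    w_plus = weights.copy()
    w_plus[i] += eps
    w_minus = weights.copy()
    w_minus[i] -= eps
    return (loss_fn(w_plus) - loss_fn(w_minus)) / (2 * eps)
```

The backpropagation side is then just the analytic gradient computed once at the unperturbed weights; the check compares the two vectors, e.g. with a relative error such as ||g_num - g_bp|| / (||g_num|| + ||g_bp||).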

Training with dropout

守給你的承諾、 submitted on 2021-01-27 12:46:15
Question: How are the many thinned networks resulting from dropout averaged? And which weights are to be used during the testing stage? I'm really confused about this, because each thinned network would learn a different set of weights. So is backpropagation done separately for each of the thinned networks? And how exactly are weights shared among these thinned networks? Because at testing time only one neural network is used, with one set of weights. So which set of weights is used? It is said that a
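The thinned networks all share one underlying weight set: each minibatch samples a mask, backpropagates through that masked network, and updates the shared weights. A minimal sketch of the common "inverted dropout" convention (`p_keep` and the function names are illustrative):

```python
import numpy as np

p_keep = 0.5  # probability that a unit is kept

def forward_train(h, rng):
    # Sample one "thinned" network and rescale the survivors so the
    # expected activation matches the full network
    mask = rng.random(h.shape) < p_keep
    return h * mask / p_keep

def forward_test(h):
    # Test time uses the single shared weight set with no dropout;
    # the train-time rescaling already accounts for the averaging
    return h
```

In the original dropout paper's convention the scaling goes the other way: train without rescaling, then multiply the weights by p_keep at test time. The two are equivalent in expectation.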

How to print the “actual” learning rate in Adadelta in PyTorch

試著忘記壹切 submitted on 2021-01-27 12:41:08
Question: In short: I can't draw an lr/epoch curve when using the Adadelta optimizer in PyTorch, because optimizer.param_groups[0]['lr'] always returns the same value. In detail: Adadelta dynamically adapts over time using only first-order information and has minimal computational overhead beyond vanilla stochastic gradient descent [1]. In PyTorch, the source code of Adadelta is here: https://pytorch.org/docs/stable/_modules/torch/optim/adadelta.html#Adadelta Since it requires no manual tuning of learning
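In Adadelta, param_groups[0]['lr'] is only a fixed multiplier; the adaptive part lives in the per-parameter optimizer state. A sketch of reading it out, assuming the 'square_avg' and 'acc_delta' state keys used by the PyTorch implementation (the model here is a placeholder):

```python
import torch

model = torch.nn.Linear(4, 1)  # placeholder model
opt = torch.optim.Adadelta(model.parameters(), lr=1.0)

# ...after at least one opt.step()...
eps = opt.param_groups[0]['eps']
for p in model.parameters():
    state = opt.state[p]
    if not state:
        continue  # no step taken yet for this parameter
    # Adadelta scales each gradient by sqrt(acc_delta + eps) / sqrt(square_avg + eps)
    effective = (state['acc_delta'] + eps).sqrt() / (state['square_avg'] + eps).sqrt()
    print(effective.mean().item())  # average adaptive step multiplier
```

Logging that per-epoch quantity would give the lr/epoch curve the question is after, since it is what actually changes over time.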

Keras Python Multi Image Input shape error

橙三吉。 submitted on 2021-01-27 11:53:28
Question: I am trying to teach myself to build a CNN that takes more than one image as an input. Since the dataset I created to test this is large, and in the long run I hope to solve a problem involving a very large dataset, I am using a generator to read images into arrays, which I pass to the Keras Model's fit_generator function. When I run my generator in isolation it works fine and produces outputs of the appropriate shape. It yields a tuple containing two entries, the first of which has shape (4
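For a multi-input model, the generator's first tuple entry must itself be a list (or dict) of arrays, one per model input. A toy sketch under that assumption (the shapes, layers, and `pair_generator` are illustrative, not the asker's code):

```python
import numpy as np
from tensorflow import keras

def pair_generator(batch_size=4):
    # Stand-in for the image-reading generator: yields ([a, b], y),
    # one array per model input, plus the targets
    while True:
        a = np.random.rand(batch_size, 64, 64, 3)
        b = np.random.rand(batch_size, 64, 64, 3)
        y = np.random.rand(batch_size, 1)
        yield [a, b], y

in_a = keras.Input(shape=(64, 64, 3))
in_b = keras.Input(shape=(64, 64, 3))
x = keras.layers.concatenate([keras.layers.Flatten()(in_a),
                              keras.layers.Flatten()(in_b)])
model = keras.Model([in_a, in_b], keras.layers.Dense(1)(x))
model.compile(optimizer='adam', loss='mse')
model.fit(pair_generator(), steps_per_epoch=10)  # fit accepts generators; fit_generator is deprecated in TF2
```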

Recurrent neural networks for Time Series with Multiple Variables - TensorFlow

北慕城南 submitted on 2021-01-27 05:42:27
Question: I'm using previous demand to predict future demand, using 3 variables, but whenever I run the code I get an error on the Y side. If I use only one variable for Y, there is no error. Example:

demandaY = bike_data[['cnt']]
n_steps = 20
for time_step in range(1, n_steps+1):
    demandaY['cnt'+str(time_step)] = demandaY[['cnt']].shift(-time_step).values
y = demandaY.iloc[:, 1:].values
y = np.reshape(y, (y.shape[0], n_steps, 1))

DATASET SCRIPT

features = ['cnt','temp','hum']
demanda = bike
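A sketch of how the target construction generalizes to all three variables, so the final reshape is (samples, n_steps, n_features) rather than (samples, n_steps, 1) (the `make_targets` helper is illustrative, not the asker's script):

```python
import numpy as np
import pandas as pd

def make_targets(df, features, n_steps):
    # Shift every feature column forward by 1..n_steps and stack the results,
    # then reshape to (samples, n_steps, n_features)
    cols = [df[features].shift(-t).add_suffix(f'_{t}') for t in range(1, n_steps + 1)]
    y = pd.concat(cols, axis=1).dropna().values
    return y.reshape(y.shape[0], n_steps, len(features))

# Usage with the question's columns:
# y = make_targets(bike_data, ['cnt', 'temp', 'hum'], n_steps=20)
```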

indices[201] = [0,8] is out of order. Many sparse ops require sorted indices. Use `tf.sparse.reorder` to create a correctly ordered copy

十年热恋 submitted on 2021-01-27 05:06:56
Question: I'm building a neural network, encoding every variable, and when I go to fit the model an error is raised: indices[201] = [0,8] is out of order. Many sparse ops require sorted indices. Use `tf.sparse.reorder` to create a correctly ordered copy. [Op:SerializeManySparse] I don't know how to solve it. I can post some code here, and if you want more I can post more:

def process_atributes(df, train, test):
    continuas = ['Trip_Duration']
    cs = MinMaxScaler()
    trainCont = cs.fit_transform(train[continuas])
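A minimal sketch of what the error message itself suggests, assuming you can get hold of the offending SparseTensor before it reaches the model (the tensor here is a toy stand-in):

```python
import tensorflow as tf

# A SparseTensor whose indices are not in canonical row-major order,
# mimicking the "[0,8] is out of order" situation
sp = tf.sparse.SparseTensor(indices=[[0, 8], [0, 3]],
                            values=[1.0, 2.0],
                            dense_shape=[1, 10])
sp = tf.sparse.reorder(sp)  # returns a correctly ordered copy
```

When the sparse input comes from something like scikit-learn's OneHotEncoder, another common workaround is to densify it first (e.g. encoder.transform(...).toarray()) so no sparse ops are involved at all.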

How to implement the derivative of Leaky ReLU in Python?

半腔热情 submitted on 2021-01-27 04:50:27
Question: How would I implement the derivative of Leaky ReLU in Python without using TensorFlow? Is there a better way than this? I want the function to return a numpy array:

def dlrelu(x, alpha=.01):
    # return alpha if x < 0 else 1
    return np.array([1 if i >= 0 else alpha for i in x])

Thanks in advance for the help.

Answer 1: The method you use works, but strictly speaking you are computing the derivative with respect to the loss, or lower layer, so it might be wise to also pass the value from the lower layer to
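As a side note, a fully vectorized form of the same derivative (not from the answer above, just a common numpy idiom):

```python
import numpy as np

def dlrelu(x, alpha=0.01):
    # 1 where x >= 0, alpha elsewhere; works for arrays of any shape
    return np.where(x >= 0, 1.0, alpha)
```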

Neural network: How to calculate the error for a unit

╄→гoц情女王★ submitted on 2021-01-23 08:14:00
Question: I am trying to work out question 26 from this exam paper (the exam is from 2002, not one I'm being marked on!). This is the exact question. The answer is B. Could someone point out where I'm going wrong? I worked out I1 from the previous question on the paper to be 0.982. The activation function is sigmoid, so the error term for output 1 should be d1 = f(I1)[1 - f(I1)](T1 - Z1). From the question: T1 = 0.58, Z1 = 0.83, so T1 - Z1 = -0.25. sigmoid(I1) = sigmoid(0.982) = 0.728, and 1 - sigmoid(I1) = 1 - 0.728 = 0.272.
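Finishing the arithmetic with the question's own numbers gives the error term for output 1 (whether -0.0495 matches option B cannot be checked from this excerpt, since the answer choices are not reproduced):

```python
# d1 = f(I1) * (1 - f(I1)) * (T1 - Z1), using the values worked out above
f = 0.728                       # sigmoid(0.982)
d1 = f * (1 - f) * (0.58 - 0.83)
print(round(d1, 4))             # -0.0495
```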