softmax

Tensorflow seq2seq get sequence hidden state

Anonymous (unverified), submitted 2019-12-03 08:48:34
Question: I just started working with TensorFlow not long ago. I'm working on the seq2seq model and somehow got the tutorial to work, but I'm stuck at getting the hidden states of each sentence. As far as I understand, the seq2seq model takes an input sequence and generates a hidden state for the sequence through an RNN. Later, the model uses the sequence's hidden state to generate a new sequence of data. My problem is: what should I do if I want to use the hidden state of the input sequence directly? Say, for example, I have a trained model; how should I get the
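If the goal is just the encoder's final hidden state, one option is to run the embedded input through `tf.nn.dynamic_rnn` and fetch the returned final state. This is a minimal TF 1.x sketch, not the tutorial's exact API; `encoder_inputs`, the dimensions, and the LSTM cell are assumptions:

```python
import tensorflow as tf

vocab_size, embed_dim, hidden_dim = 10000, 128, 256

encoder_inputs = tf.placeholder(tf.int32, [None, None])        # [batch, time]
embeddings = tf.get_variable("embeddings", [vocab_size, embed_dim])
embedded = tf.nn.embedding_lookup(embeddings, encoder_inputs)

cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
# final_state is an LSTMStateTuple(c, h); final_state.h is the per-sentence
# hidden-state vector that summarizes the input sequence.
outputs, final_state = tf.nn.dynamic_rnn(cell, embedded, dtype=tf.float32)

sentence_vector = final_state.h   # shape [batch, hidden_dim]
```

`final_state.h` can then be fetched with `session.run()` or fed into whatever downstream model needs the sentence representation.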

How to do weighted softmax output custom op in mxnet?

Anonymous (unverified), submitted 2019-12-03 07:36:14
Question: I want to replace mx.symbol.SoftmaxOutput with a weighted version (assigning each label a weight according to its frequency in the whole dataset). The original function works well, like below: cls_prob = mx.symbol.SoftmaxOutput(data=data, label=label, multi_output=True, normalization='valid', use_ignore=True, ignore_label=-1, name='cls_prob') My current code is below. It runs without errors, but the loss quickly explodes to NaN. I am dealing with a detection problem; the RCNNL1 loss quickly becomes NaN when I use my code as
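One way to sketch a class-weighted softmax cross-entropy in MXNet's symbol API (a hedged sketch, not the asker's code; the variable names, the per-sample weight input, and the omission of ignore_label handling are all assumptions) is to build the loss from log_softmax and MakeLoss, feeding precomputed per-sample weights from the data iterator. Normalizing the weights so they average to roughly 1 keeps the gradient scale comparable to the unweighted loss, which is a common fix when a weighted loss blows up to NaN:

```python
import numpy as np
import mxnet as mx

# Hypothetical per-class weights, e.g. inverse label frequency, normalized
# so the average weight is ~1 (keeps the gradient scale stable).
class_freq = np.array([0.7, 0.2, 0.1])
class_weights = 1.0 / class_freq
class_weights /= class_weights.mean()

data = mx.sym.Variable("data")              # [batch, num_classes] logits
label = mx.sym.Variable("label")            # [batch] integer labels
sample_weight = mx.sym.Variable("weight")   # [batch] = class_weights[label], from the iterator

log_prob = mx.sym.log_softmax(data, axis=-1)
nll = -mx.sym.pick(log_prob, label, axis=-1)          # per-sample cross-entropy
weighted_loss = mx.sym.MakeLoss(mx.sym.mean(nll * sample_weight),
                                name="weighted_cls_loss")
```

Supplying the per-sample weights from the data iterator keeps the symbol graph simple and avoids turning the weight lookup into a learnable parameter.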

CS231n: How to calculate gradient for Softmax loss function?

被刻印的时光 ゝ, submitted 2019-12-03 04:53:54
Question: I am watching some videos from Stanford CS231n: Convolutional Neural Networks for Visual Recognition, but I do not quite understand how to calculate the analytical gradient of the softmax loss function using numpy. From this stackexchange answer, the softmax gradient is calculated as dW_j = (p_j - 1{j == y_i}) * x_i, i.e. the predicted probability of class j minus 1 if j is the true class, times the input. The Python implementation for the above is:

    num_classes = W.shape[0]
    num_train = X.shape[1]
    for i in range(num_train):
        for j in range(num_classes):
            p = np.exp(f_i[j]) / sum_i
            dW[j, :] += (p - (j == y[i])) * X[:, i]

Could anyone explain
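A vectorized version of that same loop, under the shapes the question implies (W is [num_classes, D], X is [D, num_train], y holds integer labels, and the scores are f = W.dot(X)), could look like this sketch:

```python
import numpy as np

def softmax_loss_grad(W, X, y, reg=0.0):
    """Softmax loss and gradient, vectorized (sketch).

    Equivalent to the loop dW[j, :] += (p - (j == y[i])) * X[:, i].
    """
    num_train = X.shape[1]
    scores = W.dot(X)                              # [C, N]
    scores -= scores.max(axis=0, keepdims=True)    # numerical stability
    p = np.exp(scores)
    p /= p.sum(axis=0, keepdims=True)              # softmax probabilities, [C, N]

    correct = p[y, np.arange(num_train)]
    loss = -np.log(correct).sum() / num_train + 0.5 * reg * np.sum(W * W)

    # dL/df_j = p_j - 1{j == y_i}, accumulated over all samples.
    dscores = p.copy()
    dscores[y, np.arange(num_train)] -= 1.0
    dW = dscores.dot(X.T) / num_train + reg * W    # [C, D]
    return loss, dW
```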

How to implement the Softmax derivative independently from any loss function?

Anonymous (unverified), submitted 2019-12-03 02:52:02
Question: For a neural network library I implemented some activation functions and loss functions and their derivatives. They can be combined arbitrarily, and the derivative at the output layer just becomes the product of the loss derivative and the activation derivative. However, I failed to implement the derivative of the Softmax activation function independently of any loss function. Due to the normalization, i.e. the denominator in the equation, changing a single input activation changes all output activations, not just one. Here is my
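The loss-independent derivative of softmax is not a single vector but the full Jacobian J[i][j] = s_i * (delta_ij - s_j); the backward pass then multiplies the upstream gradient by that Jacobian. A minimal numpy sketch (function names are illustrative, not from the question):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def softmax_jacobian(z):
    """Full derivative of softmax, independent of any loss:
    J[i, j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    return np.diag(s) - np.outer(s, s)

def softmax_backward(z, upstream_grad):
    """Chain the upstream gradient through the Jacobian.
    J is symmetric, so J.dot(g) == J.T.dot(g)."""
    return softmax_jacobian(z).dot(upstream_grad)
```

When the loss happens to be cross-entropy, this product collapses to the familiar p - y shortcut, which is why many implementations never expose the Jacobian explicitly.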

Undefined symbols for architecture x86_64: for caffe build

Anonymous (unverified), submitted 2019-12-03 02:49:01
Question: I got this error when building caffe. How can I fix it? I'm using Mac OS X Yosemite 10.10.1. CONSOLE LOG: Machida-no-MacBook-Air:caffe machidahiroaki$ /usr/bin/clang++ -shared -o .build_release/lib/libcaffe.so .build_release/src/caffe/proto/caffe.pb.o .build_release/src/caffe/proto/caffe_pretty_print.pb.o .build_release/src/caffe/blob.o .build_release/src/caffe/common.o .build_release/src/caffe/data_transformer.o .build_release/src/caffe/dataset_factory.o .build_release/src/caffe/internal_thread.o .build_release/src/caffe/layer_factory.o .build

How to implement the Softmax derivative independently from any loss function?

前提是你, submitted 2019-12-03 02:26:52
For a neural network library I implemented some activation functions and loss functions and their derivatives. They can be combined arbitrarily, and the derivative at the output layer just becomes the product of the loss derivative and the activation derivative. However, I failed to implement the derivative of the Softmax activation function independently of any loss function. Due to the normalization, i.e. the denominator in the equation, changing a single input activation changes all output activations, not just one. Here is my Softmax implementation where the derivative fails the

Tensorflow negative sampling

Anonymous (unverified), submitted 2019-12-03 02:24:01
Question: I am trying to follow the Udacity tutorial on TensorFlow, where I came across the following two lines for word embedding models:

    # Look up embeddings for inputs.
    embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed, train_labels, num_sampled, vocabulary_size))

Now I understand that the second statement is for sampling negative labels. But the question is: how does it know what
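The negatives do not come from the labels at all: sampled_softmax_loss draws them from a candidate sampler over the whole vocabulary (by default a log-uniform sampler, which favors frequent word ids). A minimal TF 1.x sketch of drawing the same kind of sample explicitly (variable names are assumptions, not from the tutorial):

```python
import tensorflow as tf

num_sampled, vocabulary_size = 64, 50000
train_labels = tf.placeholder(tf.int64, shape=[None, 1])

# sampled_softmax_loss calls a sampler like this internally; the negatives
# are simply word ids drawn from [0, vocabulary_size).
sampled, true_expected, sampled_expected = tf.nn.log_uniform_candidate_sampler(
    true_classes=train_labels,   # used to account for accidental hits on true labels
    num_true=1,
    num_sampled=num_sampled,
    unique=True,
    range_max=vocabulary_size)
```

So each batch gets num_sampled negative word ids, and the loss uses the expected counts returned by the sampler to correct for the sampling distribution.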

Tensor is not an element of this graph

Anonymous (unverified), submitted 2019-12-03 02:22:01
Question: I'm getting this error: 'ValueError: Tensor Tensor("Placeholder:0", shape=(1, 1), dtype=int32) is not an element of this graph.' The code runs perfectly fine without with tf.Graph().as_default(): . However, I need to call M.sample(...) multiple times, and each time the memory isn't freed after session.close(). Probably there is a memory leak, but I'm not sure where it is. I want to restore a pre-trained neural network, set it as the default graph, and test it multiple times (like 10000) over the default graph without making it larger each
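A common pattern for this situation (a hedged TF 1.x sketch; the checkpoint path and tensor names are placeholders, not from the question) is to restore the graph once, keep a single session alive across all the sample calls, and finalize the graph so that any op accidentally added inside the loop raises immediately instead of silently growing the graph:

```python
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # Restore the pre-trained network once.
    saver = tf.train.import_meta_graph("model.ckpt.meta")   # assumed checkpoint name
    sess = tf.Session(graph=graph)
    saver.restore(sess, "model.ckpt")
    graph.finalize()   # any later attempt to add ops raises, catching graph growth

for _ in range(10000):
    # Every fed or fetched tensor must belong to `graph`, e.g. looked up by name,
    # not taken from a freshly rebuilt model object.
    out = sess.run(graph.get_tensor_by_name("output:0"),
                   feed_dict={graph.get_tensor_by_name("Placeholder:0"): [[0]]})

sess.close()
```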

numpy : calculate the derivative of the softmax function

Anonymous (unverified), submitted 2019-12-03 02:14:01
Question: I am trying to understand backpropagation in a simple 3-layer neural network on MNIST. There is the input layer with weights and a bias. The labels are MNIST, so each label is a 10-class vector. The second layer is a linear transform. The third layer is the softmax activation to get the output as probabilities. Backpropagation calculates the derivative at each step and calls this the gradient. Earlier layers combine the global (upstream) gradient with their local gradient. I am having trouble calculating the local gradient of the softmax. Several
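In practice there are two ways to handle that local gradient: compute the full softmax Jacobian, or, when the loss is cross-entropy, use the combined shortcut dL/dz = p - y at the output layer. A numpy sketch of the shortcut for an MNIST-sized output layer (shapes and function names are assumptions):

```python
import numpy as np

def softmax(z):
    # z: [batch, 10] logits
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def output_layer_grads(a, W, b, y_onehot):
    """a: [batch, hidden] activations into the last linear layer,
    W: [hidden, 10], b: [10], y_onehot: [batch, 10] one-hot labels."""
    z = a.dot(W) + b
    p = softmax(z)
    # Combined softmax + cross-entropy local gradient: dL/dz = p - y
    dz = (p - y_onehot) / a.shape[0]
    dW = a.T.dot(dz)       # gradient for the last layer's weights
    db = dz.sum(axis=0)    # gradient for the bias
    da = dz.dot(W.T)       # upstream gradient passed to the previous layer
    return dW, db, da
```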

TensorFlow/TFLearn: ValueError: Cannot feed value of shape (64,) for Tensor u'target/Y:0', which has shape '(?, 10)'

Anonymous (unverified), submitted 2019-12-03 02:06:01
Question: I have been trying to perform regression using tflearn and my own dataset. Using tflearn I have been trying to implement a convolutional network based on an example that uses the MNIST dataset. Instead of using the MNIST dataset, I have tried replacing the training and test data with my own. My data is read in from a csv file and has a different shape from the MNIST data. I have 255 features, which represent a 15*15 grid, and a target value. In the example I replaced lines 24-30 with (and included import numpy as np): #read in train and test csv
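The error says the fed targets have shape (64,) while the network's regression layer still expects one-hot MNIST targets of shape (?, 10). For a single continuous target, one fix (a hedged tflearn sketch with dummy arrays standing in for the CSV; the layer sizes and hyperparameters are assumptions) is to reshape Y to (?, 1) and give the network one linear output instead of 10 softmax units:

```python
import numpy as np
import tflearn

# Dummy data in place of the CSV: a 15x15 grid per sample and one continuous target.
X = np.random.rand(640, 15, 15, 1).astype(np.float32)
Y = np.random.rand(640).astype(np.float32).reshape(-1, 1)   # shape (?, 1), not (64,)

net = tflearn.input_data(shape=[None, 15, 15, 1])
net = tflearn.conv_2d(net, 32, 3, activation='relu')
net = tflearn.fully_connected(net, 64, activation='relu')
# One linear output for regression instead of 10 softmax units:
net = tflearn.fully_connected(net, 1, activation='linear')
net = tflearn.regression(net, optimizer='adam',
                         loss='mean_square', learning_rate=0.001)

model = tflearn.DNN(net)
model.fit(X, Y, n_epoch=1, batch_size=64)
```

The key point is that the last fully_connected layer and the fed Y array must agree on the target shape; whichever one is changed, the other has to match.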