neural-network

how to calculate a net's FLOPs in CNN

纵然是瞬间 submitted on 2019-12-17 10:54:15
Question: I want to design a convolutional neural network that uses no more GPU resources than AlexNet. I want to use FLOPs to measure this, but I don't know how to calculate them. Are there any tools that can do it?
Answer 1: For an online tool, see http://dgschwend.github.io/netscope/#/editor . For AlexNet, see http://dgschwend.github.io/netscope/#/preset/alexnet . The tool supports most widely known layers. For custom layers, you will have to calculate the FLOPs yourself.
Answer 2: For future visitors, if you use Keras and TensorFlow
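For the by-hand calculation that the first answer mentions for custom layers, here is a minimal sketch for a plain 2D convolution. It counts multiply-accumulates (MACs) per output element and treats one MAC as two FLOPs, which is a common convention but not the only one; bias additions are ignored. The function names are hypothetical, and the example numbers (227x227 RGB input, 96 filters of 11x11, stride 4) are assumed to match the first convolution of the Caffe AlexNet preset.

```python
def conv2d_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_macs(in_channels, out_channels, kernel, in_size,
                stride=1, padding=0, groups=1):
    """Multiply-accumulates for a square conv on a square input (bias ignored)."""
    out = conv2d_output_size(in_size, kernel, stride, padding)
    # Each output element needs kernel*kernel*(in_channels/groups) MACs.
    macs_per_output = kernel * kernel * (in_channels // groups)
    return macs_per_output * out_channels * out * out


# Assumed configuration of AlexNet's first convolution.
macs = conv2d_macs(in_channels=3, out_channels=96, kernel=11,
                   in_size=227, stride=4)
flops = 2 * macs  # convention: one MAC = one multiply + one add

print(f"conv1: {macs:,} MACs, about {flops:,} FLOPs")
```

Summing this over every layer gives a whole-network estimate that can be compared against the figures an online tool reports.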

TimeDistributed(Dense) vs Dense in Keras - Same number of parameters

北慕城南 submitted on 2019-12-17 10:42:27
Question: I'm building a model that converts one string into another using recurrent layers (GRUs). I have tried both a Dense and a TimeDistributed(Dense) layer as the last-but-one layer, but I don't understand the difference between the two when using return_sequences=True, especially as they seem to have the same number of parameters. My simplified model is the following:
    InputSize = 15
    MaxLen = 64
    HiddenSize = 16
    inputs = keras.layers.Input(shape=(MaxLen, InputSize))
    x = keras.layers.recurrent
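One way to see why the parameter counts can match: in Keras 2, a Dense layer applied to a 3D input acts on the last axis, so it reuses one weight matrix at every timestep, which is also what TimeDistributed(Dense) does. The plain-Python sketch below illustrates that idea only; it does not call Keras. `MAX_LEN` and `HIDDEN_SIZE` come from the question, while `UNITS` and both function names are hypothetical.

```python
MAX_LEN = 64      # timesteps, from the question
HIDDEN_SIZE = 16  # GRU output features per timestep, from the question
UNITS = 8         # hypothetical width of the Dense layer


def dense_param_count(in_features, units):
    """Weights plus biases of one Dense layer; no dependence on sequence length."""
    return in_features * units + units


def apply_per_timestep(sequence, weights, bias):
    """Apply the same (units x in_features) weights to every timestep."""
    return [
        [sum(w * x for w, x in zip(row, step)) + b
         for row, b in zip(weights, bias)]
        for step in sequence
    ]


weights = [[0.1] * HIDDEN_SIZE for _ in range(UNITS)]
bias = [0.0] * UNITS
sequence = [[1.0] * HIDDEN_SIZE for _ in range(MAX_LEN)]

outputs = apply_per_timestep(sequence, weights, bias)
params = dense_param_count(HIDDEN_SIZE, UNITS)

print(len(outputs), len(outputs[0]), params)
```

The output keeps all 64 timesteps, yet the parameter count depends only on the feature size and the layer width.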

PyTorch / Gensim - How to load pre-trained word embeddings

一笑奈何 submitted on 2019-12-17 10:32:39
Question: I want to load a pre-trained word2vec embedding with gensim into a PyTorch embedding layer. How do I get the embedding weights loaded by gensim into the PyTorch embedding layer? Thanks in advance!
Answer 1: I just wanted to report my findings about loading a gensim embedding with PyTorch. Solution for PyTorch 0.4.0 and newer: since v0.4.0 there is a new function, from_pretrained(), which makes loading an embedding very convenient. Here is an example from the documentation. >> #

How to calculate optimal batch size

不问归期 submitted on 2019-12-17 09:35:53
Question: Sometimes I run into a problem: OOM when allocating tensor with shape, e.g. OOM when allocating tensor with shape (1024, 100, 160), where 1024 is my batch size and I don't know what the rest is. If I reduce the batch size or the number of neurons in the model, it runs fine. Is there a generic way to calculate the optimal batch size based on the model and GPU memory, so the program doesn't crash? In short: I want the largest batch size possible in terms of my model, which will fit into my GPU memory and
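As a rough illustration of how such a tensor's footprint scales with batch size, the sketch below computes the bytes needed for the shape from the error message, assuming 4-byte float32 elements. The function names and the 32 MiB budget are hypothetical. A real limit also depends on every other activation, the weights, the gradients, and optimizer state, so this bounds a single tensor only.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Memory for one dense tensor; 4 bytes per element assumes float32."""
    return reduce(mul, shape, 1) * bytes_per_element


def max_batch_size(per_sample_bytes, budget_bytes):
    """Largest batch whose copy of this one tensor fits in the budget."""
    return budget_bytes // per_sample_bytes


shape = (1024, 100, 160)             # shape from the OOM message
total = tensor_bytes(shape)          # whole batch
per_sample = tensor_bytes(shape[1:]) # one sample, batch dimension dropped

budget = 32 * 1024 * 1024            # hypothetical 32 MiB allowance
largest = max_batch_size(per_sample, budget)

print(f"{total / 2**20:.1f} MiB for the batch, {per_sample} bytes per sample")
print(f"largest batch within the budget: {largest}")
```

Because the first dimension is the batch size, the footprint of this tensor grows linearly with it, which is why halving the batch size is the usual first remedy.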

Keras Sequential model input layer

不羁岁月 submitted on 2019-12-17 07:38:34
Question: When creating a Sequential model in Keras, I understand that you provide the input shape in the first layer. Does this input shape then create an implicit input layer? For example, the model below explicitly specifies 2 Dense layers, but is this actually a model with 3 layers, consisting of one input layer implied by the input shape, one hidden dense layer with 32 neurons, and one output layer with 10 possible outputs?
    model = Sequential([
        Dense(32, input_shape=(784,)),
        Activation('relu'),
        Dense

How to initialize weights in PyTorch?

感情迁移 submitted on 2019-12-17 07:01:37
Question: How do I initialize the weights and biases (for example, with He or Xavier initialization) of a network in PyTorch?
Answer 1: Single layer
To initialize the weights of a single layer, use a function from torch.nn.init. For instance:
    conv1 = torch.nn.Conv2d(...)
    torch.nn.init.xavier_uniform(conv1.weight)
Alternatively, you can modify the parameters by writing to conv1.weight.data (which is a torch.Tensor). Example:
    conv1.weight.data.fill_(0.01)
The same applies to biases:
    conv1.bias.data.fill_(0.01)
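To make the He and Xavier schemes named in the question concrete, here is a plain-Python sketch of the quantities they compute; it does not call PyTorch. Xavier uniform draws from U(-a, a) with a = gain * sqrt(6 / (fan_in + fan_out)), and He normal (for ReLU) uses a standard deviation of sqrt(2 / fan_in). All function names and the 3-to-16-channel 3x3 convolution are hypothetical. Note that current PyTorch releases spell the in-place initializers with a trailing underscore, for example torch.nn.init.xavier_uniform_.

```python
import math
import random


def conv_fans(in_channels, out_channels, kernel_h, kernel_w):
    """fan_in and fan_out of a conv layer: channels times receptive field size."""
    receptive = kernel_h * kernel_w
    return in_channels * receptive, out_channels * receptive


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Half-width a of the U(-a, a) distribution used by Xavier uniform."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def he_normal_std(fan_in):
    """Standard deviation used by He (Kaiming) normal init for ReLU layers."""
    return math.sqrt(2.0 / fan_in)


def sample_xavier_uniform(fan_in, fan_out, count, seed=0):
    """Draw `count` weights from the Xavier uniform distribution."""
    rng = random.Random(seed)
    bound = xavier_uniform_bound(fan_in, fan_out)
    return [rng.uniform(-bound, bound) for _ in range(count)]


# Hypothetical layer: 3 input channels, 16 output channels, 3x3 kernel.
fan_in, fan_out = conv_fans(3, 16, 3, 3)
bound = xavier_uniform_bound(fan_in, fan_out)
std = he_normal_std(fan_in)
samples = sample_xavier_uniform(fan_in, fan_out, count=fan_in * 16)

print(f"fan_in={fan_in}, fan_out={fan_out}, bound={bound:.4f}, he_std={std:.4f}")
```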

Neural network always predicts the same class

拈花ヽ惹草 submitted on 2019-12-17 06:26:08
Question: I'm trying to implement a neural network that classifies images into one of two discrete categories. The problem is that it currently always predicts 0 for any input, and I'm not really sure why. Here's my feature extraction method:
    def extract(file):
        # Resize and subtract mean pixel
        img = cv2.resize(cv2.imread(file), (224, 224)).astype(np.float32)
        img[:, :, 0] -= 103.939
        img[:, :, 1] -= 116.779
        img[:, :, 2] -= 123.68
        # Normalize features
        img = (img.flatten() - np.mean(img)) / np

Caffe | solver.prototxt values setting strategy

余生颓废 submitted on 2019-12-17 04:32:44
Question: In Caffe, I am trying to implement a Fully Convolutional Network for semantic segmentation. Is there a specific strategy for setting the 'solver.prototxt' values of the following hyper-parameters: test_iter, test_interval, iter_size, max_iter? Do they depend on the number of images in the training set? If so, how?
Answer 1: In order to set these values in a meaningful manner, you need a few more pieces of information about your data: 1. Training set size the total
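As a hedged sketch of how these four values can follow from dataset and batch sizes: in Caffe, one solver iteration processes batch_size * iter_size training images, and test_iter is the number of test batches evaluated per test pass. The helper below is hypothetical, as are all the example numbers, and testing once per epoch is a common convention rather than a rule taken from the answer.

```python
import math


def solver_values(train_size, val_size, train_batch, test_batch,
                  iter_size, epochs, tests_per_epoch=1):
    """Derive solver settings so tests cover the whole validation set."""
    # Gradients are accumulated over iter_size batches per iteration.
    effective_batch = train_batch * iter_size
    iters_per_epoch = math.ceil(train_size / effective_batch)
    return {
        "test_iter": math.ceil(val_size / test_batch),
        "test_interval": max(1, iters_per_epoch // tests_per_epoch),
        "iter_size": iter_size,
        "max_iter": iters_per_epoch * epochs,
    }


# Hypothetical segmentation setup with small batches due to memory limits.
values = solver_values(train_size=10000, val_size=1000,
                       train_batch=4, test_batch=2,
                       iter_size=5, epochs=30)

for key, value in values.items():
    print(f"{key}: {value}")
```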

Tackling Class Imbalance: scaling contribution to loss and sgd

主宰稳场 submitted on 2019-12-17 04:28:41
Question: (An update to this question has been added.) I am a graduate student at the University of Ghent, Belgium; my research is about emotion recognition with deep convolutional neural networks. I'm using the Caffe framework to implement the CNNs. Recently I've run into a problem concerning class imbalance. I'm using 9216 training samples, of which approx. 5% are labeled positive (1); the remaining samples are labeled negative (0). I'm using the SigmoidCrossEntropyLoss layer to calculate the loss. When
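One common way to scale each class's contribution to the loss, sketched here in plain Python rather than as a Caffe layer, is to weight every sample inversely to its class frequency so that both classes contribute equally in total. The function names are hypothetical, and the count of 461 positives is an assumption derived from "approx. 5%" of 9216; this is an illustration of the idea, not the approach the question's author settled on.

```python
import math


def class_weights(n_pos, n_neg):
    """Inverse-frequency weights; each class then sums to half the total weight."""
    total = n_pos + n_neg
    return total / (2.0 * n_pos), total / (2.0 * n_neg)


def weighted_sigmoid_cross_entropy(logit, label, w_pos, w_neg):
    """Numerically stable sigmoid cross-entropy, scaled by the class weight."""
    # Stable form: max(x, 0) - x*z + log(1 + exp(-|x|))
    loss = max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
    return (w_pos if label == 1 else w_neg) * loss


n_pos = 461            # assumed: roughly 5% of 9216
n_neg = 9216 - n_pos
w_pos, w_neg = class_weights(n_pos, n_neg)

# An undecided prediction (logit 0) costs far more on a positive sample.
loss_pos = weighted_sigmoid_cross_entropy(0.0, 1, w_pos, w_neg)
loss_neg = weighted_sigmoid_cross_entropy(0.0, 0, w_pos, w_neg)

print(f"w_pos={w_pos:.3f}, w_neg={w_neg:.3f}")
print(f"loss at logit 0: positive={loss_pos:.3f}, negative={loss_neg:.3f}")
```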

How to assign a value to a TensorFlow variable?

别说谁变了你拦得住时间么 submitted on 2019-12-17 03:34:23
Question: I am trying to assign a new value to a TensorFlow variable in Python.
    import tensorflow as tf
    import numpy as np
    x = tf.Variable(0)
    init = tf.initialize_all_variables()
    sess = tf.InteractiveSession()
    sess.run(init)
    print(x.eval())
    x.assign(1)
    print(x.eval())
But the output I get is
    0
    0
So the value has not changed. What am I missing?
Answer 1: In TF1, the statement x.assign(1) does not actually assign the value 1 to x, but rather creates a tf.Operation that you have to explicitly run to update