Keras

Why does the BERT transformer use the [CLS] token for classification instead of an average over all tokens?

Submitted by 旧巷老猫 on 2020-12-01 12:00:35
Question: I am experimenting with the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation, which is later passed to other models for the downstream task. BERT's last layer looks like this (figure in the original post), where we take the [CLS] token of each sentence (image source). I went through many discussions of this: a Hugging Face issue, a Data Science forum question, a GitHub issue. Most data scientists give this explanation: BERT is bidirectional, the [CLS]…
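
To make the distinction concrete, here is a minimal sketch contrasting [CLS] pooling with a masked mean over all tokens, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (both are illustrative choices, not from the question itself):

    # Minimal sketch: [CLS] pooling vs. mean pooling over BERT's last
    # hidden layer. Assumes TF 2.x and Hugging Face transformers.
    import tensorflow as tf
    from transformers import BertTokenizer, TFBertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = TFBertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("An example sentence.", return_tensors="tf")
    last_hidden = model(inputs).last_hidden_state   # (batch, seq_len, hidden)

    cls_vector = last_hidden[:, 0, :]               # [CLS] is always position 0

    # Mean-pooling alternative: average only over real (non-padding) tokens.
    mask = tf.cast(inputs["attention_mask"], tf.float32)[:, :, tf.newaxis]
    mean_vector = tf.reduce_sum(last_hidden * mask, axis=1) / tf.reduce_sum(mask, axis=1)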

Keras: initialize a large embedding layer with pretrained embeddings

Submitted by 心已入冬 on 2020-12-01 06:12:50
Question: I am trying to re-train a word2vec model in Keras 2 with the TensorFlow backend, using pretrained embeddings and a custom corpus. This is how I initialize the embedding layer with the pretrained embeddings: embedding = Embedding(vocab_size, embedding_dim, input_length=1, name='embedding', embeddings_initializer=lambda x: pretrained_embeddings) where pretrained_embeddings is a big matrix of size vocab_size × embedding_dim. This works as long as pretrained_embeddings is not too big. In my case…
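
One workaround worth noting here, as a minimal sketch (assuming TF 2.x Keras and that pretrained_embeddings is a NumPy array of shape (vocab_size, embedding_dim), defined elsewhere): instead of baking the matrix into the graph through an initializer, which can hit graph-size limits for very large matrices, build the layer first and copy the weights in afterwards.

    # Minimal sketch: build the Embedding layer with default initialization,
    # then overwrite its weights with the pretrained matrix.
    from tensorflow.keras.layers import Embedding, Input
    from tensorflow.keras.models import Model

    vocab_size, embedding_dim = pretrained_embeddings.shape

    inp = Input(shape=(1,))
    out = Embedding(vocab_size, embedding_dim, input_length=1,
                    name="embedding")(inp)
    model = Model(inp, out)

    # Copy the pretrained matrix into the already-built layer.
    model.get_layer("embedding").set_weights([pretrained_embeddings])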

Keras: change learning rate

Submitted by 倖福魔咒の on 2020-12-01 03:41:40
Question: I'm trying to change the learning rate of my model after it has been trained with a different learning rate. I read here, here, here, and in some other places I can't even find anymore. I tried:

    model.optimizer.learning_rate.set_value(0.1)
    model.optimizer.lr = 0.1
    model.optimizer.learning_rate = 0.1
    K.set_value(model.optimizer.learning_rate, 0.1)
    K.set_value(model.optimizer.lr, 0.1)
    model.optimizer.lr.assign(0.1)

...but none of them worked! I don't understand how there could be such confusion…
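
For reference, a minimal sketch of the variant that works in recent TF 2.x (assuming model is already compiled and the learning rate is a plain value rather than a schedule): the optimizer's learning rate is a tf.Variable, so it can be assigned directly or set through the backend helper.

    import tensorflow as tf

    # Direct assignment to the underlying variable:
    model.optimizer.learning_rate.assign(0.1)

    # Equivalently, via the backend helper:
    tf.keras.backend.set_value(model.optimizer.learning_rate, 0.1)

    print(float(model.optimizer.learning_rate))  # -> 0.1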

Change the input size in Keras

Submitted by 前提是你 on 2020-11-30 16:58:06
Question: I have trained a fully convolutional neural network with Keras. I used the Functional API and defined the input layer as Input(shape=(128,128,3)), corresponding to the size of the images in my training set. However, I want to use the trained model on images of variable sizes (which should be fine, because the network is fully convolutional). To do this, I need to change my input layer to Input(shape=(None,None,3)). The obvious way to solve the problem would have been to train my…
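
A minimal sketch of the usual workaround (assuming a hypothetical build_network helper that re-applies the trained layers to a new input tensor, and trained_model holding the learned weights): rebuild the same architecture on a variable-size input and transfer the weights, which match because convolution kernels do not depend on spatial size.

    from tensorflow.keras.layers import Input
    from tensorflow.keras.models import Model

    new_input = Input(shape=(None, None, 3))
    new_output = build_network(new_input)  # hypothetical helper re-applying your layers
    new_model = Model(new_input, new_output)

    # Weight shapes are identical: convolutions are size-agnostic.
    new_model.set_weights(trained_model.get_weights())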

GradientTape convergence much slower than keras.Model.fit

Submitted by 蹲街弑〆低调 on 2020-11-30 12:25:09
Question: I am currently trying to get a hold of the TF 2.0 API, but as I compared GradientTape to a regular keras.Model.fit I noticed: it ran slower (probably due to eager execution), and it converged much slower (and I am not sure why).

    +-------+--------------+------------------------+-----------------+
    | Epoch | GradientTape | GradientTape shuffling | keras.Model.fit |
    +-------+--------------+------------------------+-----------------+
    |   1   |    0.905     |         0.918          |     0.8793      |
    +-------+--------------+------------------------+-----------------+
    …
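
For context, a minimal sketch of a GradientTape training loop (assuming model, optimizer, loss_fn, num_epochs, and an unbatched tf.data.Dataset train_ds of (x, y) pairs already exist; all names are placeholders). Two things usually close most of the gap with fit(): wrapping the step in tf.function, and re-shuffling the data every epoch, both of which fit() does by default.

    import tensorflow as tf

    @tf.function  # compile the train step; pure eager execution is much slower
    def train_step(x, y):
        with tf.GradientTape() as tape:
            logits = model(x, training=True)
            loss = loss_fn(y, logits)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

    for epoch in range(num_epochs):
        # Re-shuffle every epoch, as fit() does by default.
        for x_batch, y_batch in train_ds.shuffle(1024).batch(32):
            loss = train_step(x_batch, y_batch)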

Keras attention layer over LSTM

Submitted by 旧街凉风 on 2020-11-30 06:47:25
Question: I'm using Keras 1.0.1 and I'm trying to add an attention layer on top of an LSTM. This is what I have so far, but it doesn't work:

    input_ = Input(shape=(input_length, input_dim))
    lstm = GRU(self.HID_DIM, input_dim=input_dim, input_length=input_length,
               return_sequences=True)(input_)
    att = TimeDistributed(Dense(1)(lstm))
    att = Reshape((-1, input_length))(att)
    att = Activation(activation="softmax")(att)
    att = RepeatVector(self.HID_DIM)(att)
    merge = Merge([att, lstm], "mul")
    hid = Merge("sum")…
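
For comparison, a minimal sketch of the same soft-attention idea in current tf.keras (the Keras 1.x Merge layer no longer exists; the sizes below are illustrative stand-ins for the question's values): score each timestep with Dense(1), softmax over the time axis, then take the weighted sum of the recurrent states. Note also that in the snippet above, TimeDistributed(Dense(1)(lstm)) misplaces a parenthesis; the wrapper should wrap the layer, not its output.

    import tensorflow as tf
    from tensorflow.keras.layers import Input, GRU, Dense, Softmax, Lambda
    from tensorflow.keras.models import Model

    input_length, input_dim, hid_dim = 20, 32, 64  # assumed sizes

    inp = Input(shape=(input_length, input_dim))
    states = GRU(hid_dim, return_sequences=True)(inp)  # (batch, T, hid_dim)

    scores = Dense(1)(states)                          # one score per timestep
    weights = Softmax(axis=1)(scores)                  # normalize over time
    context = Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))(
        [states, weights])                             # weighted sum -> (batch, hid_dim)

    model = Model(inp, context)
    model.summary()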

Load a Keras h5 model with unknown metrics

Submitted by 岁酱吖の on 2020-11-29 19:14:44
Question: I have trained a Keras CNN, monitoring the metrics as follows:

    METRICS = [
        TruePositives(name='tp'),
        FalsePositives(name='fp'),
        TrueNegatives(name='tn'),
        FalseNegatives(name='fn'),
        BinaryAccuracy(name='accuracy'),
        Precision(name='precision'),
        Recall(name='recall'),
        AUC(name='auc'),
    ]

and then the model.compile:

    model.compile(optimizer='nadam', loss='binary_crossentropy', metrics=METRICS)

It works perfectly, and I saved my h5 model (model.h5). Now I have downloaded the model and I would like to…
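
A minimal sketch of the two usual ways around unknown-metric errors on reload (assuming TF 2.x Keras and the model.h5 file from the question): either skip compilation when loading and re-compile yourself, or pass the metric classes through custom_objects. The exact custom_objects keys depend on what the error message reports, so the mapping below is an assumption.

    from tensorflow.keras.models import load_model
    from tensorflow.keras.metrics import (AUC, BinaryAccuracy, FalseNegatives,
                                          FalsePositives, Precision, Recall,
                                          TrueNegatives, TruePositives)

    # Option 1: load architecture + weights only, then compile again yourself.
    model = load_model("model.h5", compile=False)
    model.compile(optimizer="nadam", loss="binary_crossentropy",
                  metrics=[TruePositives(name="tp"), FalsePositives(name="fp"),
                           TrueNegatives(name="tn"), FalseNegatives(name="fn"),
                           BinaryAccuracy(name="accuracy"),
                           Precision(name="precision"), Recall(name="recall"),
                           AUC(name="auc")])

    # Option 2: tell load_model how to resolve the names it can't
    # deserialize (keys are an assumption; match them to your error).
    model = load_model("model.h5", custom_objects={
        "TruePositives": TruePositives, "FalsePositives": FalsePositives,
        "TrueNegatives": TrueNegatives, "FalseNegatives": FalseNegatives,
    })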