Why does the BERT transformer use the [CLS] token for classification instead of an average over all tokens?
**Question:** I am doing experiments on the BERT architecture and found that most fine-tuning tasks take the final hidden layer as the text representation, which is then passed to other models for the downstream task. BERT's last layer looks like this, where we take the [CLS] token of each sentence:

[Figure: BERT's final hidden states, with the [CLS] token of each sentence taken as the sentence representation. Image source]

I went through many discussions of this question: a huggingface issue, a Data Science forum question, and a GitHub issue. Most data scientists give this explanation:

> BERT is bidirectional, the [CLS]
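To make the two options being compared concrete, here is a minimal sketch of both pooling strategies, assuming the Hugging Face `transformers` and `torch` packages; the model name `bert-base-uncased` and the example sentence are placeholders:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT pooling example", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

last_hidden = outputs.last_hidden_state        # shape: (batch, seq_len, hidden)

# Strategy 1: take the final hidden state of the [CLS] token (position 0)
# as the sentence vector.
cls_vector = last_hidden[:, 0, :]              # shape: (batch, hidden)

# Strategy 2: mean-pool over all tokens, using the attention mask so that
# padding positions do not contribute to the average.
mask = inputs["attention_mask"].unsqueeze(-1)  # shape: (batch, seq_len, 1)
mean_vector = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)

print(cls_vector.shape, mean_vector.shape)     # both: torch.Size([1, 768])
```

Either vector can then be fed to a downstream classifier; the question is why fine-tuning setups conventionally use `cls_vector` rather than `mean_vector`.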