embedding

PyTorch RuntimeError: Expected object of device type cuda but got device type cpu for argument #1 'self' in call to _th_index_select

Submitted by 帅比萌擦擦* on 2021-02-11 14:37:48
Question: I am training a model that takes tokenized strings, which are passed through an embedding layer and then an LSTM. However, there seems to be an error with the input, as it never makes it through the embedding layer.

    class DrugModel(nn.Module):
        def __init__(self, input_dim, output_dim, hidden_dim, drug_embed_dim,
                     lstm_layer, lstm_dropout, bi_lstm, linear_dropout,
                     char_vocab_size, char_embed_dim, char_dropout, dist_fn,
                     learning_rate, binary, is_mlp, weight_decay, is_graph,
                     g_layer, g…
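The error message itself is the clue: the tensor passed as argument 'self' to index_select is the embedding weight matrix, and it is on a different device from the token ids. In other words, the module and its inputs do not live on the same device. A minimal sketch of the usual fix, independent of the asker's DrugModel:

```python
import torch
import torch.nn as nn

# Minimal sketch: nn.Embedding looks up weights via index_select, so the
# weight matrix and the index tensor must sit on the same device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

embed = nn.Embedding(num_embeddings=100, embedding_dim=16).to(device)
tokens = torch.randint(0, 100, (8, 12))  # token ids, created on the CPU

out = embed(tokens.to(device))           # move the batch before the forward pass
print(out.shape)                         # torch.Size([8, 12, 16])
```

In a training loop this typically means calling model.to(device) once, and tensor.to(device) on every batch pulled from the DataLoader.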

BERT sentence embeddings: how to obtain a sentence embedding vector

Submitted by 天大地大妈咪最大 on 2021-02-11 13:41:14
Question: I'm using the bert-for-tf2 module to wrap the BERT model as a Keras layer in TensorFlow 2.0, and I've followed the guide for implementing BERT as a Keras layer. I'm trying to extract embeddings from a sentence; in my case, the sentence is "Hello". I have a question about the output of the model prediction; I've written this model:

    model_word_embedding = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,), dtype='int32', name='input_ids'),
        bert_layer
    ])
    model_word_embedding.build(input…
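A model like the one above returns one vector per token, shape (batch, 4, 768), not a single sentence vector. A common way to turn token vectors into a sentence embedding is to pool over the token axis. A sketch, assuming bert_layer is the bert-for-tf2 Keras layer from the question:

```python
import tensorflow as tf

# Sketch: pool per-token BERT outputs into one sentence vector.
# `bert_layer` is assumed to be the bert-for-tf2 layer from the question.
input_ids = tf.keras.layers.Input(shape=(4,), dtype='int32', name='input_ids')
token_embeddings = bert_layer(input_ids)  # (batch, 4, 768), one vector per token

# Mean-pool over tokens; taking the first ([CLS]) token is another common choice.
sentence_embedding = tf.keras.layers.GlobalAveragePooling1D()(token_embeddings)

model = tf.keras.Model(inputs=input_ids, outputs=sentence_embedding)  # (batch, 768)
```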

FastText - Cannot load model.bin because the C++ extension failed to allocate memory

Submitted by 随声附和 on 2021-02-07 05:58:07
Question: I'm trying to use the FastText Python API (https://pypi.python.org/pypi/fasttext). However, from what I've read, this API can't load the newer .bin model files from https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md, as discussed in https://github.com/salestock/fastText.py/issues/115. I've tried everything suggested in that issue, and furthermore https://github.com/Kyubyong/wordvectors doesn't have the .bin for English, which would otherwise have solved the problem. Does…
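Two things are worth noting here. First, the pretrained Wikipedia .bin files are several gigabytes, so a C++ allocation failure can simply mean the process ran out of RAM. Second, the salestock fastText.py wrapper is unmaintained; gensim can read Facebook's .bin format natively. A sketch, assuming gensim >= 3.8 and a hypothetical local path:

```python
# Sketch: load a Facebook-format .bin with gensim instead of fastText.py.
# Assumes gensim >= 3.8; "wiki.en.bin" is a hypothetical local path to one of
# the pretrained-vectors downloads. Needs enough free RAM for the full model.
from gensim.models.fasttext import load_facebook_vectors

wv = load_facebook_vectors("wiki.en.bin")
print(wv["hello"].shape)  # (300,) for the published Wikipedia models
```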

Where can I get the pretrained word embeddings for BERT?

Submitted by 蓝咒 on 2021-01-20 11:57:06
Question: I know that BERT has a total vocabulary size of 30522, containing words and subwords. I want to get the initial input embeddings of BERT, i.e. the table of size [30522, 768] that I can index by token id to get that token's embedding. Where can I get this table?

Answer 1: The BERT models have get_input_embeddings():

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    bert = BertModel.from_pretrained(…
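The answer's snippet is cut off above. A completed sketch of the same idea; the only assumption is that from_pretrained continues with the same 'bert-base-uncased' checkpoint as the tokenizer:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert = BertModel.from_pretrained('bert-base-uncased')  # assumed checkpoint

# get_input_embeddings() returns the nn.Embedding holding the lookup table.
table = bert.get_input_embeddings().weight             # shape [30522, 768]

token_id = tokenizer.convert_tokens_to_ids('hello')
vector = table[token_id]                               # one row per token id
print(table.shape, vector.shape)                       # [30522, 768] and [768]
```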

After training word embeddings with gensim's fastText wrapper, how do I embed new sentences?

Submitted by 浪子不回头ぞ on 2021-01-07 03:56:25
Question: After reading the tutorial in gensim's docs, I do not understand the correct way to generate new embeddings from a trained model. So far I have trained gensim's fastText embeddings like this:

    from gensim.models.fasttext import FastText as FT_gensim

    model_gensim = FT_gensim(size=100)

    # build the vocabulary
    model_gensim.build_vocab(corpus_file=corpus_file)

    # train the model
    model_gensim.train(
        corpus_file=corpus_file,
        epochs=model_gensim.epochs,
        total_examples=model_gensim.corpus…
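Once trained, a gensim FastText model can produce a vector for any word, including words never seen in training, because fastText builds word vectors from character n-grams. gensim itself does not define a sentence embedding, but averaging the word vectors is a common baseline. A sketch, reusing model_gensim from the question:

```python
import numpy as np

# fastText composes word vectors from character n-grams, so even
# out-of-vocabulary words get an embedding through wv[...].
word_vec = model_gensim.wv["unseenword"]   # shape (100,), matching size=100

# Simple baseline sentence embedding (an assumption, not a gensim API):
# average the vectors of the tokens in the new sentence.
tokens = "this is a new sentence".split()
sentence_vec = np.mean([model_gensim.wv[t] for t in tokens], axis=0)
print(sentence_vec.shape)                  # (100,)
```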
