language-model

Keras LSTM predicting next item, taking whole sequences or a sliding window. Will a sliding window need a stateful LSTM?

半城伤御伤魂 submitted on 2020-12-13 03:27:07
Question: I have a sequence prediction problem in which, given the last n items in a sequence, I need to predict the next item. I have more than 2 million sequences, each with a different number of timesteps (sequence length): some are just 5 and some are 50/60/100/200, up to 500.

seq_inputs = [
    ["AA1", "BB3", "CC4", …, "DD5"],                                # length/timesteps 5
    ["FF1", "DD3", "FF6", "KK8", "AA5", "CC8", …, "AA2"],           # length/timesteps 50
    ["AA2", "CC8", "CC11", "DD3", "FF6", "AA1", "BB3", …, "DD11"],  # length/timesteps 200
    ..
    ..
]  # there are
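The excerpt breaks off, but a common way to frame this setup is a sliding window over each sequence fed to a stateless Keras LSTM. Below is a minimal sketch of that idea, assuming the items have already been integer-encoded; the window length n, vocabulary size, and layer sizes are illustrative placeholders, not values from the question.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

def make_windows(seq_ids, n):
    """Split one integer-encoded sequence into (n-item window, next item) pairs."""
    X, y = [], []
    for i in range(len(seq_ids) - n):
        X.append(seq_ids[i:i + n])
        y.append(seq_ids[i + n])
    return np.array(X), np.array(y)

vocab_size = 1000   # assumed number of distinct items after integer encoding
n = 10              # assumed window length

model = Sequential([
    Embedding(vocab_size, 64),                   # item ID -> dense vector
    LSTM(128),                                   # stateless: each window is independent
    Dense(vocab_size, activation="softmax"),     # probability distribution over the next item
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

As a rule of thumb, a stateless LSTM is enough here because each window already contains all the context the model is given; stateful=True only matters if the hidden state needs to carry over between batches that continue the same long sequence.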

Pretraining a language model on a small custom corpus

烂漫一生 submitted on 2020-07-21 07:55:47
Question: I was curious whether it is possible to use transfer learning in text generation and re-train/pre-train a model on a specific kind of text. For example, given a pre-trained BERT model and a small corpus of medical (or any "type" of) text, make a language model that is able to generate medical text. The assumption is that you do not have a huge amount of "medical texts", which is why you have to use transfer learning. Putting it as a pipeline, I would describe this as: using a pre-trained BERT
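The excerpt cuts off at the pipeline, but the continued-pretraining step it describes is commonly done with the Hugging Face transformers library. The sketch below is an assumption on my part, not the asker's pipeline: it runs masked-language-model training of bert-base-uncased on a small plain-text file (medical_corpus.txt is a hypothetical name), with placeholder hyperparameters.

from transformers import (BertTokenizerFast, BertForMaskedLM,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
import torch

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

texts = open("medical_corpus.txt").read().splitlines()   # hypothetical small domain corpus
encodings = tokenizer(texts, truncation=True, max_length=128)

class MLMDataset(torch.utils.data.Dataset):
    """Wraps the tokenized corpus so the Trainer can iterate over it."""
    def __init__(self, enc):
        self.enc = enc
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {k: torch.tensor(v[i]) for k, v in self.enc.items()}

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir="bert-medical", num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=MLMDataset(encodings),
        data_collator=collator).train()

Note that BERT is a masked language model rather than a left-to-right generator, so for free-form text generation a causal model such as GPT-2 is usually the easier fit; the domain-adaptation step looks much the same either way.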

word2vec - what is best: add, concatenate, or average word vectors?

空扰寡人 submitted on 2020-05-25 06:41:22
Question: I am working on a recurrent language model. To learn word embeddings that can be used to initialize my language model, I am using gensim's word2vec model. After training, the word2vec model holds two vectors for each word in the vocabulary: the word embedding (rows of the input/hidden matrix) and the context embedding (columns of the hidden/output matrix). As outlined in this post, there are at least three common ways to combine these two embedding vectors: summing the context and word vector for each
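Although the excerpt stops mid-list, the three combinations it refers to are easy to sketch once both matrices are pulled out of gensim. The snippet below assumes gensim 4.x with negative sampling; the toy corpus is a placeholder, and the attribute holding the output matrix is model.syn1neg in 4.x (model.trainables.syn1neg in older versions).

import numpy as np
from gensim.models import Word2Vec

sentences = [["the", "quick", "brown", "fox"], ["the", "lazy", "dog"]]  # toy corpus
model = Word2Vec(sentences, vector_size=50, negative=5, min_count=1)

word_vecs = model.wv.vectors      # input/word embeddings, one row per vocabulary word
ctx_vecs = model.syn1neg          # output/context embeddings (same shape, same row order)

summed       = word_vecs + ctx_vecs                           # 1) sum
averaged     = (word_vecs + ctx_vecs) / 2.0                   # 2) average
concatenated = np.concatenate([word_vecs, ctx_vecs], axis=1)  # 3) concatenate (doubles the dimension)

idx = model.wv.key_to_index["fox"]    # row index of a word, used to read its combined vector
fox_combined = averaged[idx]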

Calculate perplexity of word2vec model

牧云@^-^@ submitted on 2020-01-15 04:51:53
Question: I trained a Gensim W2V model on 500K sentences (around 60K words) and I want to calculate its perplexity. What would be the best way to do so? And for 60K words, how can I check what would be a proper amount of data? Thanks. Answer 1: If you want to calculate the perplexity, you first have to retrieve the loss. In the gensim.models.word2vec.Word2Vec constructor, pass the compute_loss=True parameter - this way, gensim will store the loss for you while training. Once trained, you can call the get_latest
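The answer is cut off at the method name, but the loss-tracking step it describes can be sketched as follows; gensim 4.x is assumed, and the two-sentence corpus stands in for the real 500K sentences.

from gensim.models import Word2Vec

sentences = [["hello", "world"], ["hello", "gensim"]]   # placeholder corpus
model = Word2Vec(sentences, vector_size=100, min_count=1,
                 compute_loss=True)                     # ask gensim to accumulate the training loss

print(model.get_latest_training_loss())                 # cumulative loss recorded during training

Keep in mind this returns an accumulated training loss, not a perplexity; word2vec with negative sampling does not produce normalized next-word probabilities, so any perplexity derived from it is only an approximation.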

TensorFlow Embedding Lookup

笑着哭i submitted on 2019-12-31 08:39:08
Question: I am trying to learn how to build an RNN for speech recognition using TensorFlow. As a start, I wanted to try out some example models put up on the TensorFlow page, TF-RNN. As advised there, I had taken some time to understand how word IDs are embedded into a dense representation (vector representation) by working through the basic version of the word2vec model code. I had an understanding of what tf.nn.embedding_lookup actually does, until I encountered the same function being used with
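The excerpt ends just before the confusing second use of the function, but the basic behaviour of tf.nn.embedding_lookup is easy to show in isolation: it gathers rows of an embedding matrix by integer ID. The sizes below are toy values chosen for illustration, not ones from the tutorial.

import tensorflow as tf

vocab_size, embed_dim = 10, 4
embeddings = tf.Variable(tf.random.uniform([vocab_size, embed_dim], -1.0, 1.0))

word_ids = tf.constant([[2, 5, 1], [7, 0, 3]])       # a batch of 2 "sentences", 3 word IDs each
dense = tf.nn.embedding_lookup(embeddings, word_ids)
print(dense.shape)                                   # (2, 3, 4): every ID replaced by its embedding row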

Positional Encodings leads to worse convergence, language modeling

孤街醉人 submitted on 2019-12-25 02:44:42
Question: This is a tough question, but I might as well try. I'm implementing the architecture from this paper, https://arxiv.org/pdf/1503.08895.pdf, for language modeling. See page 2 for a diagram, and the top of page 5 for the section on positional or "temporal" encoding. More on positional encoding can be found here, https://arxiv.org/pdf/1706.03762.pdf, at the bottom of page 5/top of page 6. (I was directed to that second paper by the authors of the first.) So here's my Keras implementation in a
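The asker's Keras code is truncated, so as a reference point here is a plain NumPy sketch of the sinusoidal positional encoding defined in the second paper, PE[pos, 2i] = sin(pos / 10000^(2i/d)) and PE[pos, 2i+1] = cos(pos / 10000^(2i/d)); it is not the asker's implementation.

import numpy as np

def positional_encoding(max_len, d_model):
    pos = np.arange(max_len)[:, None]                          # (max_len, 1)
    i = np.arange(d_model)[None, :]                            # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])                       # even dimensions
    pe[:, 1::2] = np.cos(angle[:, 1::2])                       # odd dimensions
    return pe                                                  # added element-wise to the input embeddings

pe = positional_encoding(max_len=50, d_model=64)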

CMU Sphinx4 - Custom Language Model

和自甴很熟 submitted on 2019-12-24 11:27:30
Question: I have a very specific requirement. I am working on an application which will allow users to speak their employee number, which is of the format HN56C12345 (any alphanumeric character sequence), into the app. I have gone through the link http://cmusphinx.sourceforge.net/wiki/tutoriallm, but I am not sure whether that would work for my use case. So my question is threefold: Can Sphinx4 actually recognize an alphanumeric sequence with high accuracy, like an employee number in my case? If yes, can anyone
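The question is cut off after its first part, but the linked tutorial builds a statistical language model from a plain-text corpus, so one hedged sketch of a first step is generating such a corpus of spelled-out employee numbers with standard-library Python; the ID pattern, file name, and corpus size below are assumptions based on the example HN56C12345, not anything from the question.

import random
import string

DIGIT_WORDS = dict(zip("0123456789",
                       ["ZERO", "ONE", "TWO", "THREE", "FOUR",
                        "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]))

def spell_out(emp_no):
    """Spell an ID like 'HN56C12345' as separate letter and digit words."""
    return " ".join(DIGIT_WORDS[c] if c.isdigit() else c for c in emp_no)

def random_emp_no():
    # Assumed pattern: 2 letters, 2 digits, 1 letter, 5 digits (matching HN56C12345).
    return ("".join(random.choices(string.ascii_uppercase, k=2))
            + "".join(random.choices(string.digits, k=2))
            + random.choice(string.ascii_uppercase)
            + "".join(random.choices(string.digits, k=5)))

with open("emp_numbers_corpus.txt", "w") as f:
    for _ in range(10000):
        f.write("<s> " + spell_out(random_emp_no()) + " </s>\n")

For an ID with a rigid structure like this, a grammar (JSGF) that encodes the letter/digit pattern is often suggested for Sphinx4 instead of a purely statistical language model, since it constrains recognition to valid sequences.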