perplexity

How do I measure perplexity scores on an LDA model made with the textmineR package in R?

ぃ、小莉子 submitted on 2020-07-09 05:53:10
Question: I've made an LDA topic model in R using the textmineR package; it looks as follows.

    ## get textmineR dtm
    dtm2 <- CreateDtm(doc_vec = dat2$fulltext, # character vector of documents
                      ngram_window = c(1, 2),
                      doc_names = dat2$names,
                      stopword_vec = c(stopwords::stopwords("da"), custom_stopwords),
                      lower = T, # lowercase - this is the default value
                      remove_punctuation = T, # punctuation - this is the default
                      remove_numbers = T, # numbers - this is the default
                      verbose = T,
                      cpus = 4)
    dtm2 <- dtm2[,
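Perplexity is the exponential of the negative log likelihood per token, so once the fitted model yields a corpus log likelihood (textmineR reports one, e.g. through its CalcLikelihood function), the rest is one line of arithmetic. A minimal sketch of that arithmetic, shown in Python with made-up numbers:

    import math

    def perplexity_from_loglik(log_likelihood, n_tokens):
        """Perplexity = exp(-log L / N) for a corpus of N tokens."""
        return math.exp(-log_likelihood / n_tokens)

    # hypothetical values: corpus log likelihood of -1.2e6 over 150,000 tokens
    print(perplexity_from_loglik(-1.2e6, 150_000))  # ~2981

The same arithmetic applies in R to whatever log likelihood the fitted textmineR model reports.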

How to interpret the sklearn LDA perplexity score. Why does it always increase as the number of topics increases?

牧云@^-^@ submitted on 2020-01-23 01:38:07
Question: I am trying to find the optimal number of topics using sklearn's LDA model. To do this I calculate perplexity, following the code at https://gist.github.com/tmylk/b71bf7d3ec2f203bfce2. But when I increase the number of topics, the perplexity always increases, which seems wrong. Is my implementation at fault, or are these values actually correct?

    from __future__ import print_function
    from time import time
    from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
    from sklearn.decomposition import NMF,
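For reference, sklearn's LatentDirichletAllocation exposes a perplexity() method directly, and it is most meaningful when evaluated on held-out documents rather than the training set. A minimal sketch of sweeping the topic count that way (the corpus and all parameters here are placeholders, not from the question):

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.model_selection import train_test_split

    docs = fetch_20newsgroups(remove=("headers", "footers", "quotes")).data[:2000]
    X = CountVectorizer(max_features=5000, stop_words="english").fit_transform(docs)
    X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

    for k in (5, 10, 20):
        lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(X_train)
        print(k, lda.perplexity(X_test))  # lower is better

If perplexity still rises monotonically on held-out data, it is worth checking the installed scikit-learn version, since this exact symptom has been reported as a bug in older releases.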

Aren't the TensorFlow PTB RNN tutorial's test measure and state reset wrong?

烈酒焚心 submitted on 2020-01-07 05:46:07
Question: I have two questions about the TensorFlow PTB RNN tutorial code ptb_word_lm.py. The code blocks below are from that code. Is it okay to reset the state for every batch?

    self._initial_state = cell.zero_state(batch_size, data_type())
    with tf.device("/cpu:0"):
        embedding = tf.get_variable(
            "embedding", [vocab_size, size], dtype=data_type())
        inputs = tf.nn.embedding_lookup(embedding, input_.input_data)
    if is_training and config.keep_prob < 1:
        inputs = tf.nn.dropout(inputs, config.keep_prob)
    outputs = []
    state =
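On the state-reset question: when the batches are consecutive slices of one long text, the usual pattern is to reset the state once per epoch and otherwise carry each run's final state over as the next run's initial state. A minimal TF1-style sketch of that pattern (shapes and names here are made up, not from the tutorial):

    import numpy as np
    import tensorflow as tf  # assumes TensorFlow 1.x, as in the tutorial

    batch_size, num_steps, hidden = 4, 10, 32
    inputs = tf.placeholder(tf.float32, [batch_size, num_steps, hidden])
    cell = tf.nn.rnn_cell.BasicLSTMCell(hidden)
    initial_state = cell.zero_state(batch_size, tf.float32)  # an LSTMStateTuple
    outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        state = sess.run(initial_state)   # zero state: reset once per epoch
        for _ in range(3):                # consecutive batches of one long sequence
            x = np.random.randn(batch_size, num_steps, hidden).astype(np.float32)
            out, state = sess.run(
                [outputs, final_state],
                feed_dict={inputs: x,
                           initial_state.c: state.c,  # carry the state over
                           initial_state.h: state.h})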

Check perplexity of a Language Model

家住魔仙堡 submitted on 2019-12-20 06:17:02
Question: I created a language model with a Keras LSTM and now I want to assess whether it's good, so I want to calculate perplexity. What is the best way to calculate the perplexity of a model in Python?

Answer 1: I've come up with two versions and attached their corresponding sources; please feel free to check the links out.

    def perplexity_raw(y_true, y_pred):
        """
        The perplexity metric. Why isn't this part of Keras yet?!
        https://stackoverflow.com/questions/41881308/how-to-calculate-perplexity-of-rnn-in-tensorflow
        https://github.com/keras-team/keras/issues/8267
        """
        # cross_entropy = K.sparse_categorical_crossentropy(y_true, y_pred)
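As a cross-check on those versions: corpus perplexity is the exponential of the average per-token cross-entropy, so a Keras metric can be written directly that way. A minimal sketch, assuming integer (non-one-hot) targets and softmax outputs:

    from tensorflow.keras import backend as K

    def perplexity(y_true, y_pred):
        # mean sparse categorical cross-entropy over the batch, then exponentiate
        cross_entropy = K.sparse_categorical_crossentropy(y_true, y_pred)
        return K.exp(K.mean(cross_entropy))

Because the mean is taken per batch, the value reported during training is an approximation; exact corpus perplexity comes from exponentiating the average cross-entropy over the whole evaluation set.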
