How to get a probability distribution for a topic in MALLET?

自古美人都是妖i submitted on 2019-12-02 19:31:16
Question: Using MALLET, I can get a specific number of topics and their words. How can I make sure the topic words form a probability distribution (i.e., sum to one)? For example, if I run it as below, how can I use the outputs given by MALLET to make sure the probabilities of the topic words for topic 0 add up to 1? mallet train-topics --input text.vectors --output-topic-keys topics.txt --output-doc-topics doc_comp.txt --topic-word-weights-file weights.txt --num-top-words 50 --word-topic-counts-file counts.txt --num
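
One possible way to approach this, as a rough sketch rather than a confirmed answer: normalize the weights in weights.txt per topic. The code below assumes that the file written by --topic-word-weights-file has one tab-separated line per (topic, word) pair in the form topic, word, weight, and that it covers the whole vocabulary; check this against your own output before relying on it. The function name topic_word_distributions and the sample data are made up for illustration. Note that the 50 top words listed in topics.txt are only a subset of the vocabulary, so their probabilities alone would not be expected to sum to one.

```python
from collections import defaultdict
import io


def topic_word_distributions(lines):
    """Turn per-topic word weights into per-topic probability distributions.

    Assumes each non-empty line is "topic<TAB>word<TAB>weight".
    """
    weights = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        topic, word, weight = line.split("\t")
        weights[int(topic)][word] = float(weight)

    distributions = {}
    for topic, word_weights in weights.items():
        total = sum(word_weights.values())
        # Dividing by the topic total makes the values sum to 1 for that topic.
        distributions[topic] = {w: v / total for w, v in word_weights.items()}
    return distributions


# Made-up sample in the assumed format (not real MALLET output).
sample = io.StringIO(
    "0\tapple\t3.01\n"
    "0\tbanana\t1.01\n"
    "1\tapple\t0.01\n"
    "1\tbanana\t2.01\n"
)
dists = topic_word_distributions(sample)
print(sum(dists[0].values()))
```

With a real run, the same function could be fed the file directly, for example by opening weights.txt with open("weights.txt", encoding="utf-8") and passing the file object in place of the sample.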

Remove empty documents from DocumentTermMatrix in R topicmodels?

我怕爱的太早我们不能终老 submitted on 2019-11-27 06:37:34
I am doing topic modelling using the topicmodels package in R. I am creating a Corpus object, doing some basic preprocessing, and then creating a DocumentTermMatrix:

corpus <- Corpus(VectorSource(vec), readerControl=list(language="en"))
corpus <- tm_map(corpus, tolower)
corpus <- tm_map(corpus,