How to understand the output of Topic Model class in Mallet?

难免孤独 2021-02-03 11:49

While trying out the example code in the topic modeling developer's guide, I would really like to understand what the output of that code means.

First during the running p

3 answers

暗喜 (original poster)

2021-02-03 12:27

For question 3, I believe the 0.008 (the "topic distribution") relates to the prior \alpha over topic distributions for documents. Mallet optimises this prior, essentially allowing some topics to carry more "weight". Mallet seems to be estimating that topic 0 accounts for a small proportion of your corpus.
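To make the \alpha reading concrete: under a Dirichlet prior, the expected share of topic k in a document is \alpha_k divided by the sum of all the \alpha values, so a topic with a very small \alpha is expected to cover little of the corpus. Here is a minimal Python sketch of that arithmetic. It is not Mallet code, the function name is hypothetical, and every alpha value except the 0.008 quoted above is made up for illustration.

```python
def expected_topic_proportions(alphas):
    """Return the prior mean of a Dirichlet(alphas): alpha_k / sum(alphas)."""
    total = sum(alphas)
    if total <= 0:
        raise ValueError("alphas must sum to a positive value")
    return [a / total for a in alphas]


# Hypothetical per-topic alpha values; only 0.008 (topic 0) comes from the thread.
alphas = [0.008, 0.05, 0.12, 0.03]
proportions = expected_topic_proportions(alphas)

for topic, (alpha, share) in enumerate(zip(alphas, proportions)):
    print(f"topic {topic}: alpha={alpha:.3f}  expected prior share={share:.3f}")
```

With these toy numbers topic 0 gets the smallest expected share, which matches the reading that a small optimised \alpha means the topic accounts for a small part of the corpus.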

The token counts shown cover only the words with the highest counts in each topic. The remaining words in topic 0 could, for example, all have a count of 0, while the remaining words in topic 9 could each have a count of 3. Thus topic 9 can account for many more words in your corpus than topic 0, even though the counts for its top words are lower.
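A toy example of that effect, in Python. The counts are invented purely for illustration and do not come from any Mallet run; `summarize` is a hypothetical helper that compares the sum over the displayed top words with the sum over all words in a topic.

```python
def summarize(word_counts, top_n=3):
    """Return (sum of the top_n counts, sum of all counts) for one topic."""
    ranked = sorted(word_counts.values(), reverse=True)
    return sum(ranked[:top_n]), sum(ranked)


# Hypothetical topic 0: a few frequent words and nothing else.
topic_0 = {"w1": 50, "w2": 40, "w3": 30}

# Hypothetical topic 9: lower top counts, but a long tail of words with count 3.
topic_9 = {"w1": 20, "w2": 15, "w3": 10}
topic_9.update({f"tail_{i}": 3 for i in range(100)})

for name, counts in (("topic 0", topic_0), ("topic 9", topic_9)):
    top, total = summarize(counts)
    print(f"{name}: top-3 tokens={top}, all tokens={total}")
```

Here topic 0 has the larger top-word counts (120 versus 45), yet topic 9 holds more tokens overall (345 versus 120) because of its long tail.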

I'd have to check the code for the "0 0.55" at the end, but that's probably the optimised \beta value (which I'm pretty sure isn't optimised asymmetrically).
