Generating random sentences from custom text in Python's NLTK?

借酒劲吻你 2020-12-24 03:53

I'm having trouble with NLTK under Python, specifically the .generate() method.

generate(self, length=100)

Print random text, generat

5 Answers

Happy的楠姐 (original poster)

2020-12-24 04:35

You should be "training" the Markov model with multiple sequences, so that you accurately sample the starting state probabilities as well (called "pi" in Markov-speak). If you use a single sequence then you will always start in the same state.
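To make the point about starting states concrete, here is a minimal sketch in plain Python (no NLTK required). The helper name `start_distribution` and the toy token lists are assumptions made for illustration only; they are not part of NLTK.

```python
from collections import Counter

def start_distribution(sequences):
    """Estimate pi: the probability of each token being the first in a sequence."""
    counts = Counter(seq[0] for seq in sequences if seq)
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}

# One long sequence: only a single start state is ever observed.
single = [["the", "clock", "struck", "thirteen", "he", "went", "in"]]

# The same tokens split into two sequences: two start states are observed.
multiple = [["the", "clock", "struck", "thirteen"], ["he", "went", "in"]]

pi_single = start_distribution(single)
pi_multiple = start_distribution(multiple)
print(pi_single)    # {'the': 1.0}
print(pi_multiple)  # {'the': 0.5, 'he': 0.5}
```

With a single sequence, all of the probability mass sits on one start token, which is why generated text always begins the same way.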

In the case of Orwell's 1984 you would want to use sentence tokenization first (NLTK is very good at it), then word tokenization (yielding a list of lists of tokens, not just a single list of tokens) and then feed each sentence separately to the Markov model. This will allow it to properly model sequence starts, instead of being stuck on a single way to start every sequence.
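The pipeline described above could be sketched roughly as follows. This is not NLTK's own `.generate()` implementation. It is a hand-rolled bigram chain, and the regex-based `split_sentences` and `split_words` helpers are crude stand-ins (assumptions) for `nltk.sent_tokenize` and `nltk.word_tokenize`, which you would normally use instead. The `<s>` and `</s>` markers and the sample text are also illustrative placeholders.

```python
import random
import re
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical sentence-boundary markers

def split_sentences(text):
    # Crude stand-in for nltk.sent_tokenize: split after ., ! or ? plus whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(sentence):
    # Crude stand-in for nltk.word_tokenize: words and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence)

def train(sentences):
    """Build bigram transitions from a list of token lists, one list per sentence."""
    transitions = defaultdict(list)
    for tokens in sentences:
        padded = [START] + tokens + [END]
        for prev, nxt in zip(padded, padded[1:]):
            transitions[prev].append(nxt)
    return transitions

def generate(transitions, rng, max_len=30):
    """Sample one sentence, beginning from the learned sentence-start distribution."""
    out, state = [], START
    while len(out) < max_len:
        state = rng.choice(transitions[state])
        if state == END:
            break
        out.append(state)
    return out

text = "The cat sat. A dog barked. The dog sat."  # placeholder corpus
sentences = [split_words(s) for s in split_sentences(text)]  # list of lists of tokens
model = train(sentences)
sample = generate(model, random.Random(0))
print(" ".join(sample))
```

Because each sentence is padded with its own start marker, the successors of `<s>` form the empirical start distribution, so generated sentences can begin with any word that opened a sentence in the training text.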
