Generating random sentences from custom text in Python's NLTK?

借酒劲吻你 2020-12-24 03:53

I'm having trouble with NLTK under Python, specifically the .generate() method.

generate(self, length=100)

Print random text, generat

5 answers
  •  Happy的楠姐
    2020-12-24 04:35

    You should be "training" the Markov model with multiple sequences, so that it also estimates the starting-state probabilities (called "pi" in Markov-speak). If you use a single sequence, then you will always start in the same state.

    In the case of Orwell's 1984 you would want to use sentence tokenization first (NLTK is very good at it), then word tokenization (yielding a list of lists of tokens, not just a single list of tokens) and then feed each sentence separately to the Markov model. This will allow it to properly model sequence starts, instead of being stuck on a single way to start every sequence.
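
To make the idea concrete, here is a minimal sketch of a bigram Markov chain trained on a list of token lists, one list per sentence. It is plain Python rather than NLTK's own generator; the names `build_model`, `generate_sentence`, `START` and `END` are made up for illustration, and the three toy sentences stand in for a real tokenized corpus. With NLTK installed, the list of lists could come from `nltk.sent_tokenize` followed by `nltk.word_tokenize` on each sentence.

```python
import random
from collections import defaultdict

# Hypothetical marker tokens (not part of NLTK) for sentence start and end.
START, END = "<s>", "</s>"


def build_model(sentences):
    """Collect bigram successors, treating each sentence as its own sequence."""
    model = defaultdict(list)
    for tokens in sentences:
        if not tokens:
            continue
        # Padding each sentence lets the model learn how sentences begin and end.
        padded = [START] + list(tokens) + [END]
        for prev, nxt in zip(padded, padded[1:]):
            model[prev].append(nxt)
    return model


def generate_sentence(model, rng=random, max_len=50):
    """Walk the chain from START until END is drawn or max_len tokens are emitted."""
    if START not in model:
        return []
    out = []
    state = START
    while len(out) < max_len:
        # Choosing from a list with repeats samples proportionally to the counts.
        state = rng.choice(model[state])
        if state == END:
            break
        out.append(state)
    return out


# Toy data standing in for a tokenized book. With NLTK this could be:
#   [nltk.word_tokenize(s) for s in nltk.sent_tokenize(raw_text)]
sentences = [
    ["the", "dog", "barks", "."],
    ["a", "cat", "sleeps", "."],
    ["birds", "sing", "."],
]

model = build_model(sentences)
print(" ".join(generate_sentence(model)))
```

Because every sentence contributes its own first word to the successors of `START`, generated text can open with any word that opened a training sentence, instead of always opening with the first word of the corpus.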
