Python NLTK: bigrams, trigrams, fourgrams

被撕碎了的回忆 2020-12-25 14:16

I have this example and I want to know how to get this result. I have a text that I tokenize, and then I collect the bigrams, trigrams, and fourgrams like this:

im         


        
4 answers
  •  悲哀的现实
    2020-12-25 15:20

    I do it like this:

    def words_to_ngrams(words, n, sep=" "):
        # Slide a window of n words over the list and join each window with sep.
        return [sep.join(words[i:i+n]) for i in range(len(words)-n+1)]
    

    This takes a list of words as input and returns a list of n-grams (for the given n). The words inside each n-gram are joined by sep (a space by default).
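A quick usage sketch, in case it helps to see the output for n = 2, 3 and 4. The sample sentence is made up for illustration, and the plain str.split() call is only a stand-in for whatever tokenizer you actually use (an assumption, since the question's own code is cut off).

```python
def words_to_ngrams(words, n, sep=" "):
    # Slide a window of n words over the list and join each window with sep.
    return [sep.join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical sample text; str.split() stands in for a real tokenizer.
text = "the quick brown fox jumps"
words = text.split()

bigrams = words_to_ngrams(words, 2)
trigrams = words_to_ngrams(words, 3)
fourgrams = words_to_ngrams(words, 4)

print(bigrams)    # ['the quick', 'quick brown', 'brown fox', 'fox jumps']
print(trigrams)   # ['the quick brown', 'quick brown fox', 'brown fox jumps']
print(fourgrams)  # ['the quick brown fox', 'quick brown fox jumps']

# If n is larger than the number of words, range() is empty and so is the result.
too_long = words_to_ngrams(words, 6)
print(too_long)   # []
```

A list of k words yields k - n + 1 n-grams, so the lists get shorter as n grows.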
