Fast n-gram calculation

谎友^ 2020-12-02 12:47

I'm using NLTK to search for n-grams in a corpus but it's taking a very long time in some cases. I've noticed calculating n-grams isn't an uncommon feature in other pack

3 Answers
  •  青春惊慌失措
    2020-12-02 13:19

    For character-level n-grams, you could use the following function:

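    def ngrams(text, n):
        # Slide a window of width n over text; each slice is one character n-gram.
        # Returns an empty list when text is shorter than n.
        return [text[i:i + n] for i in range(len(text) - n + 1)]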
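
The same windowing idea carries over to word-level n-grams, which is closer to searching a corpus. What follows is only a minimal sketch under a few assumptions: the corpus is already tokenized into a list (here by a plain whitespace split), and the `word_ngrams` helper and the sample sentence are illustrative names, not part of the original answer or of NLTK.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Return an iterator of n-grams (as tuples) over a list of tokens."""
    # Build n staggered views of the list (tokens[0:], tokens[1:], ...).
    # zip stops at the shortest view, so only complete n-grams are produced.
    return zip(*(tokens[i:] for i in range(n)))


# Hypothetical sample text, tokenized by whitespace (an assumption).
tokens = "the cat sat on the cat".split()

# Counting n-grams once up front makes later lookups a dictionary access.
bigram_counts = Counter(word_ngrams(tokens, 2))
```

Counting everything in a single pass, as `bigram_counts` does here, may help if the slow part is looking up many different n-grams in the same corpus, since each lookup afterwards avoids rescanning the text.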
    
