I have some text that I tokenize, and I want to collect the bigrams, trigrams, and four-grams from the resulting tokens. How can I get this result?
I do it like this:
def words_to_ngrams(words, n, sep=" "):
    return [sep.join(words[i:i+n]) for i in range(len(words)-n+1)]
This takes a list of words as input and returns a list of n-grams for the given n, with the words in each n-gram joined by sep (a space by default). If the list has fewer than n words, the result is an empty list.
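To get the bigrams, trigrams, and four-grams in one go, you can call the function once per n. This is only a sketch: the sample sentence, the plain whitespace split used as a tokenizer, and the name ngrams_by_size are assumptions for illustration, so swap in your own tokenized list.

```python
def words_to_ngrams(words, n, sep=" "):
    # Slide a window of n words over the list and join each window with sep.
    return [sep.join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical input; a whitespace split stands in for a real tokenizer.
text = "the quick brown fox jumps"
words = text.split()

# Map each n (2, 3, 4) to the list of n-grams of that size.
ngrams_by_size = {n: words_to_ngrams(words, n) for n in (2, 3, 4)}

for n, grams in ngrams_by_size.items():
    print(n, grams)
```

Keeping the results in a dict keyed by n makes it easy to look up one size later, for example ngrams_by_size[3] for the trigrams.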