I need to write a program in NLTK that breaks a corpus (a large collection of txt files) into unigrams, bigrams, trigrams, fourgrams and fivegrams. I have already written co
Maybe this helps; see the link.
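import spacy

# Load the small English pipeline (must be installed separately)
nlp_en = spacy.load("en_core_web_sm")
# doc was not defined in the original snippet; the sample text here is a placeholder
doc = nlp_en("This is a sample sentence.")
# Collect the text of each token
tokens = [x.text for x in doc]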
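For the n-gram step itself, here is a rough sketch in plain Python, assuming the text has already been tokenized into a list of strings. The helper names `ngrams` and `count_ngrams` and the sample sentence are made up for illustration and are not from the question. NLTK also ships `nltk.util.ngrams`, which yields the same kind of tuples.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a list of tokens."""
    # Zip the token list against copies of itself shifted by 0..n-1 positions
    return list(zip(*(tokens[i:] for i in range(n))))

def count_ngrams(tokens, max_n=5):
    """Map each n in 1..max_n to a Counter of the n-grams of that size."""
    return {n: Counter(ngrams(tokens, n)) for n in range(1, max_n + 1)}

# Hypothetical sample input; a real corpus would be read from the txt files
tokens = "the quick brown fox jumps over the lazy dog".split()
counts = count_ngrams(tokens)

print(counts[1][("the",)])      # 2
print(len(ngrams(tokens, 5)))   # 5
```

For a large corpus it would probably be better to update one `Counter` per n file by file rather than holding every token in memory at once.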