I am trying to tokenize my data into n-grams (minimum 1, maximum 3). After applying this function, I can see that it strips some relevant words.
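
For reference, since the original function isn't shown, here is a minimal sketch of 1- to 3-gram tokenization assuming scikit-learn's `CountVectorizer`; the `ngram_range=(1, 3)` setting, the sample texts, and the custom `token_pattern` are assumptions, not the original code:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Sample data (assumed for illustration)
docs = ["a quick brown fox", "the quick brown fox jumps"]

# Minimal sketch: extract unigrams up to trigrams.
# Note: the default token_pattern drops single-character tokens, which is one
# common reason relevant words appear to be stripped.
vectorizer = CountVectorizer(
    ngram_range=(1, 3),           # n-grams from 1 (minimum) to 3 (maximum)
    token_pattern=r"(?u)\b\w+\b"  # keep single-character words as well
)

X = vectorizer.fit_transform(docs)
print(vectorizer.get_feature_names_out())
```

If the words being dropped are stop words or single-character tokens, adjusting `stop_words` and `token_pattern` (as sketched above) is usually where to look first.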