I am trying to remove stopwords from a string of text:
from nltk.corpus import stopwords
text = 'hello bye the the hi'
text = ' '.join([word for word in
Use a regexp to remove all words that match a stopword:
import re
pattern = re.compile(r'\b(' + r'|'.join(stopwords.words('english')) + r')\b\s*')
text = pattern.sub('', text)
This will probably be way faster than looping yourself, especially for large input strings.
If the last word in the text gets deleted by this, you may be left with trailing whitespace. I suggest handling that separately, for example by stripping the result.
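As a rough sketch of the trailing-whitespace handling, something like the following should work. To keep it runnable without downloading the NLTK corpus, it assumes a tiny hand-picked `STOPWORDS` list in place of `stopwords.words('english')`; the helper name `remove_stopwords` is also just illustrative. It additionally passes each word through `re.escape`, which is a defensive choice in case a stopword ever contains a regex metacharacter.

```python
import re

# Assumption: a small illustrative list standing in for stopwords.words('english').
STOPWORDS = ["the", "a", "is"]

# Same idea as the pattern above: match any stopword as a whole word,
# plus the whitespace that follows it. re.escape guards against metacharacters.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, STOPWORDS)) + r')\b\s*')

def remove_stopwords(text):
    """Delete stopwords, then strip whitespace left over at either end."""
    return pattern.sub('', text).strip()

cleaned = remove_stopwords('hello bye the the hi the')
print(cleaned)  # hello bye hi
```

Note that the match is case-sensitive, so "The" would survive; passing `re.IGNORECASE` to `re.compile` is one way to change that if needed.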