I'm struggling with NLTK stopwords.
Here's my bit of code. Could someone tell me what's wrong?
from nltk.corpus import stopwords
def removeStopwo
If you run a tokenizer first, you can compare each token against the stoplist directly, so you don't need the re module. I also added an extra argument so you can switch between languages.
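import nltk
from nltk.corpus import stopwords

def remove_stopwords(sentence, language):
    # Keep only the tokens whose lowercase form is not in the stoplist for this language
    return [token for token in nltk.word_tokenize(sentence) if token.lower() not in stopwords.words(language)]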
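One thing to note: in the one-liner above, stopwords.words(language) is evaluated once per token. Here is a minimal sketch of a variant that builds the stoplist as a set a single time. The name filter_tokens and the tiny hand-written token list and stoplist are my own stand-ins, used only so the snippet runs without any NLTK data; they are not part of NLTK.

```python
def filter_tokens(tokens, stoplist):
    """Return the tokens whose lowercase form is not in stoplist."""
    # Build the set once so each membership test is a cheap hash lookup.
    stopset = {word.lower() for word in stoplist}
    return [token for token in tokens if token.lower() not in stopset]


# Hypothetical stand-ins for nltk.word_tokenize(sentence) and stopwords.words(language)
sample_tokens = ["This", "is", "a", "simple", "test", "."]
sample_stoplist = ["this", "is", "a"]

print(filter_tokens(sample_tokens, sample_stoplist))  # ['simple', 'test', '.']
```

With NLTK the call would look like filter_tokens(nltk.word_tokenize(sentence), stopwords.words(language)), assuming the tokenizer and stopwords data have already been fetched with nltk.download.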
Let me know if this was useful ;)