I have recently started using the nltk module for text analysis. I am stuck at a point. I want to use word_tokenize on a dataframe, so as to obtain all the words used in a p
I will show you an example. Suppose you have a data frame named twitter_df with sentiment and text columns. First, select the text column (this gives a pandas Series, not a list):
tweetText = twitter_df['text']
Then tokenize each row by applying word_tokenize to the Series:
from nltk.tokenize import word_tokenize
tweetText = tweetText.apply(word_tokenize)
tweetText.head()
I think this will help you