My Code:
import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')
ERROR Message:
[ec2-user@ip-
I got the solution:
import nltk
nltk.download()
Downloader> d
Download which package (l=list; x=cancel)? Identifier> punkt
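For scripts or servers where the interactive downloader is inconvenient, the same step can probably be done non-interactively. The following is only a sketch: the helper name `ensure_nltk_package` and its default arguments are made up for illustration, while `nltk.data.find` and `nltk.download` are the NLTK calls it relies on.

```python
def ensure_nltk_package(package_id="punkt", resource_path="tokenizers/punkt"):
    """Download an NLTK data package only if it is not already installed.

    Hypothetical helper (not part of NLTK). Returns True if the resource
    is available afterwards, False if the download failed.
    """
    import nltk  # imported lazily, so defining the helper does not need NLTK

    try:
        # nltk.data.find raises LookupError when the resource is missing
        # from every directory on NLTK's search path.
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # Non-interactive equivalent of choosing "d" and then "punkt"
        # in the downloader prompt.
        return bool(nltk.download(package_id, quiet=True))
```

Calling `ensure_nltk_package()` once before `nltk.data.load(...)` should avoid re-downloading on every run. The data goes to the default location for the current user, so it has to be run as the same user that later loads the tokenizer.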
For me, the problem was solved by prefixing the resource path with "nltk:".
http://www.nltk.org/howto/data.html
Failed loading english.pickle with nltk.data.load
sent_tokenizer=nltk.data.load('nltk:tokenizers/punkt/english.pickle')
The same thing happened to me recently. You just need to download the "punkt" package, and it should work.
When you run "list" (l) after downloading everything available, is every package marked like the following line?
[*] punkt............... Punkt Tokenizer Models
If you see this line with the star, the package is installed, and nltk should be able to load it.
I was still getting the error despite running the following:
import nltk
nltk.download()
On Google Colab, however, this solved my issue:
!python3 -c "import nltk; nltk.download('all')"
My issue was that I had called nltk.download('all') as the root user, but the process that eventually used nltk ran as a different user, which had no access to /root/nltk_data, where the content had been downloaded.
So I recursively copied everything from the download location to one of the paths NLTK searches for its data:
cp -R /root/nltk_data/ /home/ubuntu/nltk_data
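To confirm that this kind of permission problem is the cause, it may help to check which of NLTK's search directories the current user can actually read. This is a rough sketch using only the standard library; `readable_data_dirs` is a hypothetical helper name, and the list passed to it is assumed to come from `nltk.data.path`.

```python
import os

def readable_data_dirs(search_paths):
    """Return the directories in search_paths that exist and that the
    current user can read and traverse.

    Hypothetical helper: pass it nltk.data.path to see where NLTK
    could actually find its data when running as this user.
    """
    return [
        path for path in search_paths
        if os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)
    ]
```

Running `readable_data_dirs(nltk.data.path)` as the user that executes your application shows whether a directory such as /root/nltk_data is out of reach. Instead of copying, it should also work to add the directory to the search list with `nltk.data.path.append(...)` or to set the `NLTK_DATA` environment variable, provided that user has read access to it.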
Make sure you are using Jupyter Notebook, and run the following in a notebook cell:
import nltk
nltk.download()
A popup window will then appear (showing the index URL https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml). From it, download everything.
Then rerun your code.