How to download all NLTK data in Google Cloud App Engine?

Submitted by 假如想象 on 2021-01-28 08:31:20

Question


I have a Django application that I deployed by following the guide below:

https://cloud.google.com/python/django/flexible-environment

However, I use NLTK for text processing, and I get the following error:

*********************************************************************
  Resource 'taggers/maxent_treebank_pos_tagger/PY3/english.pickle'
  not found.  Please use the NLTK Downloader to obtain the
  resource:  >>> nltk.download()
  Searched in:
    - '/root/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''

So I know the NLTK data is missing. I have looked through a lot of code online, but I could not find a way to download the data on Google App Engine. My requirements.txt is below for reference.

Django==1.10.6
gunicorn==19.7.0
nltk==3.0.5

Please let me know if there is a way to do it. Thanks in advance.


Answer 1:


I used a workaround to make the NLTK data available. First, I copied the required NLTK data files into my Django app folder. Then, in settings.py, I defined a variable that points to that folder.

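NLTK_DIR = os.path.join(BASE_DIR, 'first_app', 'nltk_data')

Wherever I use NLTK, I pass this setting to nltk.data.path.append(), which adds the folder to the list of search paths that NLTK keeps in its data.py module. The name is uppercase because django.conf.settings only exposes settings whose names are uppercase.

from django.conf import settings
import nltk

nltk.data.path.append(settings.NLTK_DIR)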

With that in place, NLTK is able to find its data. :)
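
The same idea can be wrapped in a small helper so that the folder is registered only once and a missing folder is reported at startup instead of at the first tagging call. This is only a sketch under assumptions: register_data_dir is a hypothetical name, not part of NLTK, and it accepts any list of paths so it can be tried without NLTK installed.

```python
import os


def register_data_dir(search_path, data_dir):
    """Add data_dir to search_path once; fail early if it is missing.

    search_path is any mutable list of directories, e.g. nltk.data.path.
    register_data_dir is a hypothetical helper, not an NLTK function.
    """
    data_dir = os.path.abspath(data_dir)
    if not os.path.isdir(data_dir):
        raise FileNotFoundError("NLTK data directory not found: " + data_dir)
    if data_dir not in search_path:
        # Put the bundled copy first so it is searched before system-wide locations.
        search_path.insert(0, data_dir)
    return search_path
```

In the application this would be called once at startup with nltk.data.path and the folder defined in settings.py. The folder has to mirror the layout named in the error message, so the tagger from the question would sit under taggers/maxent_treebank_pos_tagger/ inside it.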



Source: https://stackoverflow.com/questions/44187168/how-to-download-all-nltk-data-in-google-cloud-app-engine
