different nltk results in django and at command line

故事扮演 submitted on 2020-01-06 17:10:51

Question


I have a Django 1.8 view that looks like this:

def sourcedoc_parse(request, sourcedoc_id):
    sourcedoc = Sourcedoc.objects.get(pk=sourcedoc_id)
    nltk.data.path.append('/root/nltk_data')
    new_words = []
    english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())    #<---the line where the error occurs
    results = {}

    template = 'sourcedoc_parse.html'
    params = {'sourcedoc': sourcedoc,'results': results, 'new_words': new_words, 'BASE_URL': BASE_URL}

    return render_to_response(template, params, context_instance=RequestContext(request))

It gives me the following error:

Django Version: 1.8
Python Version: 2.7.6
...
Traceback:
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py" in get_response
132.                     response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/rosshartshorn/htdocs/worldmaker/sourcedocs/views.py" in sourcedoc_parse
107.     english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())
File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py" in __getattr__
68.         self.__load()

File "/usr/local/lib/python2.7/dist-packages/nltk/corpus/util.py" in __load
56.             except LookupError: raise e

Exception Type: LookupError at /sourcedoc/parse/13/
Exception Value: 
**********************************************************************
Resource 'corpora/gutenberg' not found.  Please use the NLTK
Downloader to obtain the resource:  >>> nltk.download()
Searched in:
- '/var/www/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/root/nltk_data'
**********************************************************************

What is especially odd is that it works fine when I run it in the Python shell from the same directory:

Python 2.7.6 (default, Mar 22 2014, 22:59:38) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> english_vocab = set(w.lower() for w in nltk.corpus.gutenberg.words())
>>> 'jabberwocky' in english_vocab
False
>>> 'monster' in english_vocab
True
>>> nltk.data.path
['/root/nltk_data', '/usr/share/nltk_data', '/usr/local/share/nltk_data', '/usr/lib/nltk_data', '/usr/local/lib/nltk_data']

Does anyone know what causes the difference between running this inside a Django view and running the same thing at the Python command line? I've also tried it with 'python manage.py shell', and it works there too.

Any debugging advice on finding the difference is also welcome.


Answer 1:


The problem here is that the user running Django doesn't have permission to read /root.

It does not happen in the Django shell because you run the shell as root, whereas the server runs as the www user. Note that the first directory NLTK searches in the error message is /var/www/nltk_data, which is under the home directory of the www user.
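One way to check this from inside the running server is to log which user the process runs as and what happens when it tries to open each search directory. The following is only a rough sketch: probe_paths and current_user are hypothetical helper names, not part of NLTK or Django, and it assumes the corpus is unpacked as a directory at corpora/gutenberg under one of the search paths.

```python
import errno
import os
import pwd


def current_user():
    """Return the name of the user this process is effectively running as."""
    uid = os.geteuid()
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        return str(uid)


def probe_paths(paths, resource="corpora/gutenberg"):
    """Try to list the resource directory under each search path.

    Returns a list of (path, status) tuples, where status is "ok" or the
    symbolic errno name, e.g. "EACCES" (permission denied) or "ENOENT"
    (no such directory).
    """
    results = []
    for base in paths:
        target = os.path.join(base, resource)
        try:
            os.listdir(target)
            status = "ok"
        except OSError as exc:
            status = errno.errorcode.get(exc.errno, str(exc.errno))
        results.append((base, status))
    return results


# Inside the view (assumption: nltk is already imported there), one could log:
#     current_user(), probe_paths(nltk.data.path)
```

If the explanation above is right, the view should report a user other than root and a status of EACCES for /root/nltk_data, while the same calls from the root shell report "ok". Keeping the corpus in a directory the server user can read, such as one of the shared locations already in the search list, should then avoid the error.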



Source: https://stackoverflow.com/questions/30608189/different-nltk-results-in-django-and-at-command-line
