Using NLTK corpora with AWS Lambda functions in Python

遥遥无期 2021-01-02 02:20

I'm encountering a difficulty when using NLTK corpora (in particular stop words) in AWS Lambda. I'm aware that the corpora need to be downloaded and have done so with NLTK

4 Answers
  •  梦毁少年i
    2021-01-02 02:51

    Another solution is to use Lambda's ephemeral storage, which is mounted at /tmp.

    So, you would have something like this:

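    import json
    import nltk
    from nltk.tokenize import word_tokenize

    # /tmp is writable in Lambda, so add it to NLTK's search path.
    nltk.data.path.append("/tmp")

    # Fetch the punkt tokenizer models into /tmp at runtime.
    nltk.download("punkt", download_dir="/tmp")
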

    At runtime, punkt is downloaded to the /tmp directory, which is writable. However, this likely isn't a great solution under high concurrency, since each new execution environment gets its own /tmp and repeats the download on cold start.
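
One way to soften that cost is to skip the download when a warm execution environment already holds the files. The following is a minimal sketch, assuming the stop words corpus from the question; `ensure_corpus`, `NLTK_DIR`, and the `text` field of the event are hypothetical names chosen for illustration, not part of NLTK or Lambda.

```python
import json
import os

NLTK_DIR = "/tmp/nltk_data"  # assumed location; any writable path under /tmp works


def ensure_corpus(name, subdir, download, data_dir=NLTK_DIR):
    """Call download(name, download_dir=...) only if the package is missing.

    Returns True when a download was triggered, False when files were reused.
    """
    target = os.path.join(data_dir, subdir, name)
    # NLTK stores a package as an unpacked directory and/or a .zip archive.
    if os.path.exists(target) or os.path.exists(target + ".zip"):
        return False
    os.makedirs(data_dir, exist_ok=True)
    download(name, download_dir=data_dir)
    return True


def lambda_handler(event, context):
    # Imported here so the helper above stays free of third-party dependencies.
    import nltk

    if NLTK_DIR not in nltk.data.path:
        nltk.data.path.append(NLTK_DIR)
    ensure_corpus("stopwords", "corpora", nltk.download)

    from nltk.corpus import stopwords

    stop = set(stopwords.words("english"))
    # "text" is an assumed event field used only for this illustration.
    kept = [w for w in event.get("text", "").split() if w.lower() not in stop]
    return {"statusCode": 200, "body": json.dumps(kept)}
```

Warm invocations reuse whatever a previous call left in /tmp, so only cold starts pay for the download. If cold starts are frequent, bundling the corpus with the deployment package or a Lambda layer avoids the runtime download entirely.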
