Question
I recently moved my NLP application to a new machine. I recreated the same Python environment with pyenv as on the old machine and installed all the dependencies with pip. There was also a 'dependency' of sorts that pip does not install; 'model' may be a better word for it. The command that installed it is:
python -m spacy.en.download
I want to record this somewhere in my repository, in line with Python style guides and conventions, so that the step is documented if I or someone else installs the whole thing on another PC one day.
This spaCy page says that it can go in requirements.txt. While
pip freeze > requirements.txt
will create a file, it won't capture the correct procedure for installing that requirement. One day someone will run
pip install -r requirements.txt
and will still run into the same error I did:
Warning: no model found for 'en'
Only loading the 'en' tokenizer.
Does anyone know how to correctly list this requirement in requirements.txt?
Answer 1:
spaCy's data packages are actually wrapped as pip packages for exactly this reason: they have a setup.py, a version, and so on. They are large, though, so they aren't distributed via PyPI. You can still point to a URL or a file path in your requirements.txt:
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz
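With a line like that in requirements.txt, pip install -r requirements.txt should fetch the model along with everything else. As a rough sketch of how one might confirm afterwards that the model really was installed, the snippet below checks whether the package can be found. It assumes the tarball installs a module importable as en_core_web_sm (a guess based on the file name), and the helper names model_is_installed and missing_models are hypothetical, not part of spaCy.

```python
import importlib.util


def model_is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this import name can be found."""
    return importlib.util.find_spec(package_name) is not None


def missing_models(package_names):
    """Return the names from package_names that cannot be found."""
    return [name for name in package_names if not model_is_installed(name)]


if __name__ == "__main__":
    # Assumption: the model tarball installs a module named "en_core_web_sm".
    missing = missing_models(["en_core_web_sm"])
    if missing:
        print("Missing model packages:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("All model packages are installed.")
```

Running a check like this at application start-up, or in CI, would surface a missing model as an explicit message rather than as a tokenizer-only fallback.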
Many production users host their own PyPI servers (so they're not downloading arbitrary code from the internet). If you do that, you can distribute the models via a PyPI warehouse.
Source: https://stackoverflow.com/questions/44286663/what-to-do-with-non-pip-requirement-in-requirements-txt