Question
I use NLTK with wordnet in my project. I installed it manually on my PC with pip:
pip3 install nltk --user in a terminal, then nltk.download() in a Python shell to download wordnet.
I want to automate these steps with a setup.py file, but I don't know a good way to install wordnet.
For now, I have this piece of code after the call to setup ("nltk" is in the install_requires list passed to setup):
import sys
if 'install' in sys.argv:
    import nltk
    nltk.download("wordnet")
Is there a better way to do this?
Answer 1:
I managed to install the NLTK data in setup.py by overriding cmdclass with my own Install class:
from setuptools import setup, find_packages
from setuptools.command.install import install as _install

class Install(_install):
    def run(self):
        _install.do_egg_install(self)
        import nltk
        nltk.download("popular")

setup(...
    cmdclass={'install': Install},
    ...
    install_requires=[
        'nltk',
    ],
    setup_requires=['nltk']
    ...
)
It is important to call do_egg_install() in your run() method so that nltk gets installed before import nltk is executed (see also the question "python setuptools install_requires is ignored when overriding cmdclass"). Also, don't forget to add nltk to setup_requires.
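The download step inside run() could also be factored into a small helper, so that a failed download is reported instead of passing silently. This is only a minimal sketch and not part of the original answer: the name download_nltk_data and its downloader parameter are hypothetical, and it assumes that nltk.download returns a truthy value on success and a falsy one on failure.

```python
def download_nltk_data(packages, downloader=None):
    """Download each NLTK data package and return the names that failed."""
    if downloader is None:
        # Imported lazily, so nltk is only needed once it has been installed.
        import nltk
        downloader = nltk.download
    failed = []
    for name in packages:
        # Assumption: the downloader returns a falsy value on failure.
        if not downloader(name):
            failed.append(name)
    return failed
```

In the Install class above, run() could call download_nltk_data(["popular"]) after do_egg_install() and print a warning if the returned list is not empty.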
Answer 2:
You can also automate the installation with a shell script, for example by running the following after installing nltk with pip:
python -m nltk.downloader -d /usr/share/nltk_data wordnet
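If a Python script is preferred over a shell script, the same command can be launched with subprocess. The sketch below is only an illustration: the function names build_download_command and download are hypothetical, and it assumes nltk is already installed for the interpreter that runs the script.

```python
import subprocess
import sys


def build_download_command(packages, target_dir=None):
    """Build the argument list for `python -m nltk.downloader`."""
    # sys.executable is the interpreter running this script, so the
    # downloader runs in the environment where nltk was installed.
    command = [sys.executable, "-m", "nltk.downloader"]
    if target_dir is not None:
        command += ["-d", target_dir]
    command += list(packages)
    return command


def download(packages, target_dir=None):
    """Run the downloader as a child process."""
    # check=True raises CalledProcessError if the process exits
    # with a non-zero status.
    subprocess.run(build_download_command(packages, target_dir), check=True)
```

Calling download(["wordnet"], target_dir="/usr/share/nltk_data") would then run the command shown above. Writing to /usr/share usually requires elevated permissions.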
Source: https://stackoverflow.com/questions/26799894/installing-nltk-data-dependencies-in-setup-py-script