Question
I downloaded Stanford NER 3.4.1, unpacked it, and tried to run named entity recognition on a local file using the default (provided) trained model. I got this:
`java.io.FileNotFoundException: /u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters (No such file or directory) at edu.stanford.nlp.io.IOUtils.inputStreamFromFile(IOUtils.java:481)`
What's wrong and how can I fix it?
Answer 1:
It turns out that the provided models use "distributional similarity features" that require a .clusters file at a location specified inside the compressed model file (which is tricky to change). If you're on the Stanford network, presumably the required files are there. If not, I found two choices:
- Download Stanford NER without the distributional similarity features (this slightly degrades performance, but runs faster). Disclaimer: I haven't actually tried this, but it should work.
- Download the distsim file (look here) from Stanford and create a symlink to it so that it appears to be in the expected location. In my case, on a Mac, I did this:
  - I created a hierarchy of folders `u/nlp/data/pos_tags_are_useless/` somewhere,
  - copied the downloaded `egw4-reut.512.clusters` file there,
  - then ran `cd /; sudo ln -s <somewhere>/u`.
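The symlink steps above can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name `stage_clusters` and its three arguments are hypothetical, and the paths in the usage comment are placeholders, not locations from the original answer. Only the folder hierarchy, the file name, and the `ln -s` step come from the answer itself.

```shell
#!/bin/sh
# Sketch only: make the downloaded clusters file visible at the path the
# model expects (/u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters).
#
# stage_clusters SRC STAGING LINK_ROOT   (hypothetical helper)
#   SRC        path to the downloaded egw4-reut.512.clusters file
#   STAGING    the "<somewhere>" directory; should be an absolute path
#   LINK_ROOT  directory in which the "u" link is created ("/" for the real fix)
stage_clusters() {
    src=$1
    staging=$2
    link_root=$3
    target_dir="$staging/u/nlp/data/pos_tags_are_useless"

    # Step 1: create the folder hierarchy somewhere.
    mkdir -p "$target_dir" || return 1

    # Step 2: copy the downloaded clusters file there.
    cp "$src" "$target_dir/egw4-reut.512.clusters" || return 1

    # Step 3: same effect as `cd <link_root>; ln -s <somewhere>/u`.
    ( cd "$link_root" && ln -s "$staging/u" u )
}

# Hypothetical usage (placeholder paths; linking into / usually needs root,
# which is why the original command used sudo):
#   stage_clusters "$HOME/Downloads/egw4-reut.512.clusters" "$HOME/stanford-distsim" /
```

Taking the link root as an argument allows a dry run against a scratch directory before touching `/`.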
Answer 2:
This was an error in the model files accompanying the initial release of the v3.4.1 code, and has been fixed. Re-download and all should run fine, without requiring the symlink workaround.
Source: https://stackoverflow.com/questions/25569466/stanford-ner-tagger-generates-file-not-found-exception-with-provided-models