Spanish POS tagging with Stanford NLP - is it possible to get the person/number/gender?

Submitted by 隐身守侯 on 2019-12-07 02:23:26

Why does Stanford NLP only use a reduced version of the AnCora tags?

This was a practical decision made to ensure high tagging accuracy. (Retaining morphological information on the tags caused the entire tagger to suffer from data sparsity, and it did worse not only on morphological annotation but across the board.)

Is it possible to get the entire tag using Stanford NLP?

No. You could get quite far doing this with a simple rule-based system, though, or use the Stanford Classifier to train your own morphological annotator. (Feel free to share your code if you pick either path!)
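As a rough illustration of the rule-based route, here is a minimal sketch in Python that guesses gender and number for nouns, adjectives and determiners from the word ending. Everything in it is an assumption made for illustration: the function name `guess_gender_number`, the suffix lists, and the example tag strings (such as `nc0p000`) are not part of Stanford NLP, and the suffix rules are a crude heuristic that will get exceptions like "mano" or "lunes" wrong.

```python
# Heuristic sketch only: suffix lists are illustrative, not exhaustive.
FEMININE_ENDINGS = ("a", "as", "ción", "ciones", "sión", "siones",
                    "dad", "dades", "tud", "tudes")
MASCULINE_ENDINGS = ("o", "os", "or", "ores")


def guess_gender_number(word, coarse_tag):
    """Guess (gender, number) from the word ending.

    coarse_tag is the reduced tag from the tagger; only its first letter
    is used (n = noun, a = adjective, d = determiner). Returns None for
    other categories. Gender is "f", "m", or "0" when no rule applies;
    number is "s" or "p".
    """
    if not coarse_tag or coarse_tag[0] not in ("n", "a", "d"):
        return None
    w = word.lower()
    if w.endswith(FEMININE_ENDINGS):
        gender = "f"
    elif w.endswith(MASCULINE_ENDINGS):
        gender = "m"
    else:
        gender = "0"  # unknown: the ending is not covered by the rules
    # Naive plural test; invariant words such as "lunes" are misclassified.
    number = "p" if w.endswith("s") else "s"
    return gender, number


if __name__ == "__main__":
    for word, tag in [("casas", "nc0p000"), ("libro", "nc0s000"),
                      ("verde", "aq0000"), ("corre", "vmip000")]:
        print(word, tag, guess_gender_number(word, tag))
```

Verbs would need a separate set of rules (person and number from conjugation endings), and a lexicon of exceptions would probably matter more than additional suffixes.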

If you are not restricted to the Stanford POS tagger, you might want to try RDRPOSTagger, a toolkit for POS and morphological tagging. RDRPOSTagger provides pre-trained POS and morphological tagging models for 40 different languages, including Spanish.

For Spanish POS and morphological tagging, RDRPOSTagger was trained on the IULA Spanish LSP Treebank. It obtained a tagging accuracy of 97.95%, at a tagging speed of 200K words/second in the Java implementation (10K words/second in the Python implementation), on a 64-bit Windows 7 machine with a Core i5 2.50GHz CPU and 6GB of memory.
