Definition of the CESS_ESP tags

狂风中的少年 提交于 2020-01-14 14:31:53

问题


I'm using the NLTK CESS ESP data package and I've been able to use an adatpation of the spaghetti tagger and a HiddenMarkovModelTagger to pos-tag the sentence, how ever the tags that it produces are not at all like the ones used when tagging en_US sentences, here's a link to the Categorizing and Tagging documentation for NLTK, you'll notice that the tags used are uppercase and don't have any numbers or punctuation, some cess tags: vsip3s0, da0fs0.

Does some one know a reference that explains those tags?

Sentence

¿Que es la programación orientada a objetos?

Spaghetti Tagger

[('\xc2\xbfQue', None), ('es', 'vsip3s0'), ('la', 'da0fs0'), ('programaci\xc3\xb3n', None), ('orientada', 'aq0fsp'), ('a', 'sps00'), ('objetos', 'ncmp000'), ('?', 'Fit')]
[('\xc2\xbfQue', None), ('es', None), ('la', None), ('programaci\xc3\xb3n', None), ('orientada', None), ('a', None), ('objetos', None), ('?', None)]
[('\xc2\xbfQue', None), ('es', 'vsip3s0'), ('la', 'da0fs0'), ('programaci\xc3\xb3n', None), ('orientada', 'aq0fsp'), ('a', 'sps00'), ('objetos', 'ncmp000'), ('?', 'Fit')]
[('\xc2\xbfQue', None), ('es', 'vsip3s0'), ('la', 'da0fs0'), ('programaci\xc3\xb3n', None), ('orientada', 'aq0fsp'), ('a', 'sps00'), ('objetos', 'ncmp000'), ('?', 'Fit')]

Markov Tagger

[('\xc2\xbfQue', 'sn.e-SUJ'), ('es', 'vsip3s0'), ('la', 'da0fs0'), ('programaci\xc3\xb3n', 'ncfs000'), ('orientada', 'aq0fsp'), ('a', 'sps00'), ('objetos', 'ncmp000'), ('?', 'Fit')]

回答1:


The cess-esp corpus is tagged using an old annotation system named EAGLE you can see it here. Hope this helps.



来源:https://stackoverflow.com/questions/25256878/definition-of-the-cess-esp-tags

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!