How to take the suffix in smoothing of Part of speech tagging

Submitted by 删除回忆录丶 on 2019-12-11 18:52:47

Question


I am building a part-of-speech tagger, and I handle unknown words by looking at their suffixes.

The main issue is how to decide which suffixes to use. Should the set of suffixes be decided in advance (as in Weischedel's approach), or should I take the last few letters of each word (as in Samuelsson's approach)?

Which approach would be better?


Answer 1:


A quick search suggests that Weischedel's approach is sufficient for English, which has only rudimentary morphological inflection. Samuelsson's approach seems to work better for inflecting languages, which makes sense intuitively.
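To make the difference between the two ways of choosing a suffix concrete, here is a minimal Python sketch. The suffix list, the function names, and the `max_len` default are all made up for illustration and are not taken from either paper; a real tagger would use a curated list or a tuned length.

```python
# Hypothetical, hand-picked suffix list (an assumption for illustration only).
FIXED_SUFFIXES = ("ing", "ed", "ly", "ness", "tion", "s")


def fixed_suffix(word, suffixes=FIXED_SUFFIXES):
    """Pre-decided style: return the longest listed suffix the word ends with, or None."""
    w = word.lower()
    matches = [s for s in suffixes if w.endswith(s) and len(w) > len(s)]
    return max(matches, key=len) if matches else None


def trailing_suffixes(word, max_len=4):
    """Last-letters style: return every final letter sequence up to max_len, longest first."""
    w = word.lower()
    top = min(max_len, len(w))
    return [w[-i:] for i in range(top, 0, -1)]


print(fixed_suffix("happiness"))
print(trailing_suffixes("walking", max_len=3))
```

The first function can only ever report endings someone thought of in advance, while the second lets the training data decide which endings are informative, which is probably why it copes better with richer morphology.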

A quote from A Resource-light Approach to Morpho-syntactic Tagging (Google Books), p. 9:

To handle unknown words Brants (2000) uses Samuelsson's (1993) suffix analysis, which seems to work best for inflected languages.

(This is not in a direct comparison to Weischedel's approach, though.)
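For what it is worth, a last-letters suffix model can be sketched roughly as follows. This is only loosely in the spirit of the suffix analysis mentioned above, not a reproduction of it: the function names, the fixed weight `theta`, the `max_len` of 4, and the toy training data are all assumptions for illustration. The idea is to start from the overall tag distribution and blend in the estimate from each successively longer suffix that was seen in training.

```python
from collections import Counter, defaultdict


def train_suffix_model(tagged_words, max_len=4):
    """Count tags for every word-final letter sequence up to max_len."""
    suffix_tag = defaultdict(Counter)
    tag_counts = Counter()
    for word, tag in tagged_words:
        w = word.lower()
        tag_counts[tag] += 1
        for i in range(1, min(max_len, len(w)) + 1):
            suffix_tag[w[-i:]][tag] += 1
    return suffix_tag, tag_counts


def tag_distribution(word, suffix_tag, tag_counts, max_len=4, theta=0.1):
    """Estimate P(tag | unknown word) by blending shorter and longer suffix estimates."""
    total = sum(tag_counts.values())
    # Start from the unconditional tag distribution (the "empty suffix").
    probs = {t: c / total for t, c in tag_counts.items()}
    w = word.lower()
    for i in range(1, min(max_len, len(w)) + 1):
        counts = suffix_tag.get(w[-i:])
        if not counts:
            break  # longer suffixes cannot have been seen either
        n = sum(counts.values())
        # Weighted blend of this suffix's estimate and the shorter-suffix estimate.
        probs = {t: (counts[t] / n + theta * probs[t]) / (1 + theta) for t in probs}
    return probs


# Toy training data, invented for this example.
data = [("walking", "VBG"), ("running", "VBG"), ("quickly", "RB"), ("table", "NN")]
suffix_tag, tag_counts = train_suffix_model(data)
dist = tag_distribution("jumping", suffix_tag, tag_counts)
print(max(dist, key=dist.get))
```

In practice the maximum suffix length and the blending weight would be tuned on held-out data, and the suffix counts are often collected only from rare words, since those resemble unknown words most closely.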



Source: https://stackoverflow.com/questions/25310485/how-to-take-the-suffix-in-smoothing-of-part-of-speech-tagging
