New to NLP, Question about annotation

纵然是瞬间 submitted on 2019-12-24 02:48:11

Question


I am new to NLP and am looking for a starting point: tutorials, documentation, or example code. I have been asked to research ways of processing natural-language text to extract structured data from it. For example, I want to extract (annotate) height and weight from statements such as "He is 6 feet tall and weighs 200 pounds" or "His height is 6 feet and weight is 200". I have looked into UIMA, but it seems to amount to a hand-built regex dictionary with no training capabilities. In a nutshell: what Java framework can I use to build an annotation engine that can also be trained? Any help or pointers would be much appreciated. Thanks.


Answer 1:


If you really want to use machine learning to train your annotator, then GATE is probably your best bet. Take a look at the chapter on machine learning in their user guide.




Answer 2:


Since you asked for pointers: take a look at LingPipe, OpenNLP, and the Stanford NLP distributions.

Note: if Python is an option, you can use the Natural Language Toolkit.




Answer 3:


I'd use named entity recognition (NER). Here is the output I see for your input text:

You can try it here: http://deagol.cs.illinois.edu:8080
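
To make the target concrete, here is a minimal sketch of the kind of annotations you are after, written in plain Java with no NLP library. Note that this is a hand-written regex baseline, not a trained NER model, and it is not the output of the demo linked above. The class name `MeasurementAnnotator`, the `Annotation` type, and the two patterns are hypothetical and only cover the two example sentences from the question. A baseline like this can still be useful as something to compare a trained annotator against.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rule-based baseline; names and patterns are assumptions for illustration.
public class MeasurementAnnotator {

    /** One annotated span: a label, the numeric value, and character offsets into the text. */
    public static final class Annotation {
        public final String label;
        public final int value;
        public final int start;
        public final int end;

        Annotation(String label, int value, int start, int end) {
            this.label = label;
            this.value = value;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + "=" + value + " [" + start + "," + end + ")";
        }
    }

    // Matches "6 feet" / "6 ft", or "height is 6" when no unit follows the keyword.
    private static final Pattern HEIGHT = Pattern.compile(
            "(?i)\\b(\\d+)\\s*(?:feet|foot|ft)\\b|\\bheight\\s+is\\s+(\\d+)");

    // Matches "200 pounds" / "200 lbs", or "weight is 200".
    private static final Pattern WEIGHT = Pattern.compile(
            "(?i)\\b(\\d+)\\s*(?:pounds|lbs?)\\b|\\bweight\\s+is\\s+(\\d+)");

    /** Returns all height annotations followed by all weight annotations found in the text. */
    public static List<Annotation> annotate(String text) {
        List<Annotation> result = new ArrayList<>();
        collect("HEIGHT", HEIGHT, text, result);
        collect("WEIGHT", WEIGHT, text, result);
        return result;
    }

    private static void collect(String label, Pattern pattern, String text, List<Annotation> out) {
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            // Exactly one of the two capture groups is set, depending on which alternative matched.
            String digits = m.group(1) != null ? m.group(1) : m.group(2);
            out.add(new Annotation(label, Integer.parseInt(digits), m.start(), m.end()));
        }
    }

    public static void main(String[] args) {
        System.out.println(annotate("He is 6 feet tall and weighs 200 pounds"));
        System.out.println(annotate("His height is 6 feet and weight is 200"));
    }
}
```

The character offsets are kept on each annotation because trainable toolkits generally work with labeled spans over the text, so output in this shape could also serve as a rough starting point for building training data, assuming you review it by hand first.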



Source: https://stackoverflow.com/questions/4310303/new-to-nlp-question-about-annotation
