Question
Possible Duplicate:
Java : Is there a good natural language processing library
Can anybody recommend a library for NLP in Java? It would be really nice if it is properly documented too. I have tried to work with LingPipe, but I am not able to understand it completely.
Answer 1:
You should try Stanford NLP. It has many utilities and libraries for NLP, such as the Part-of-Speech Tagger, all of which are great to use and easy to understand.
Answer 2:
It is probably a bit late now, and I suppose you have moved on with your project, but you can still check this blog out. It has a series of posts on NLP with Java. Stanford NLP, as suggested by others, is a great library to work with.
Most of the libraries will help you in the lexical analysis phase (sentence segmentation, tokenization, POS tagging, parsing, etc.), so you don't have to write that code from scratch. All the best!
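To make the first two of those steps concrete, here is a minimal sketch of sentence segmentation and tokenization. Note that it is an assumption-laden illustration: it uses only the JDK's rule-based `java.text.BreakIterator`, not Stanford NLP or OpenNLP, and the class and method names (`LexicalSketch`, `sentences`, `tokens`) are made up for this example. The dedicated libraries use trained models and will generally handle abbreviations and other edge cases better.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of any NLP library.
class LexicalSketch {

    // Split text into sentences using the JDK's rule-based sentence iterator.
    static List<String> sentences(String text) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String s = text.substring(start, end).trim();
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    // Split a sentence into word tokens, dropping whitespace and punctuation.
    static List<String> tokens(String sentence) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(sentence);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String t = sentence.substring(start, end);
            // Keep only segments that begin with a letter or digit.
            if (Character.isLetterOrDigit(t.codePointAt(0))) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "I tried LingPipe first. Then I moved to another library.";
        for (String s : sentences(text)) {
            System.out.println(s + " -> " + tokens(s));
        }
    }
}
```

POS tagging and parsing need trained models, which is where a library such as Stanford NLP or OpenNLP comes in.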
Answer 3:
Here are two libraries for NLP that you can use:
- OpenNLP
- Stanford NLP
Stanford NLP Group has an effective combination of sophisticated and deep linguistic modeling and data analysis with innovative probabilistic and machine learning approaches to NLP.
Answer 4:
I've done some experimenting with Apache OpenNLP from JRuby. It's quite nice and solid, but at the time of writing it is poorly documented. If you try OpenNLP, I suggest you read the following articles:
- Getting started with OpenNLP (Natural Language Processing)
- Mining Wikipedia with Hadoop and Pig for Natural Language Processing
- OpenNLP Tutorial
- An UIMA Sentence Annotator using OpenNLP
Documentation for OpenNLP can be found here.
This is code from my project where I do named entity recognition with OpenNLP. It's written in JRuby. The OpenNLP models are stored in a database because the code runs on Heroku, where you can't write to the file system.
- Politiki Named Entity Recognition API for w/ OpenNLP, jRuby and Grape
Answer 5:
There is actually a quite good NLP tool list. It's in German, but it should work with Google Translate. I'll list some of the tools here anyway:
- Mate tools (GPL V2)
- OpenNLP (Apache License V2)
- Stanford NLP (dual licensed, GPL V2)
- TreeTagger
If you want the best for English, take Stanford, but note that it's GPL v2. For less popular languages, TreeTagger is better (it needs a smaller training corpus to work). For example, you get better results with TreeTagger on German texts; I don't remember which survey showed this, but if you want it, I can search for it. OpenNLP is not as good as the other tools, but it's under the Apache License v2, which you should consider as well.
Source: https://stackoverflow.com/questions/11116390/natural-language-processing-in-java-nlp