Question
QUESTIONS
How do I load a custom properties file using AbstractSequenceClassifier? e.g.,
Master's Degree\tDEGREE
MBA\tDEGREE
What are the benefits and drawbacks of each approach (AbstractSequenceClassifier vs. NamedEntityTagAnnotation)?
Is there any accessible documentation or tutorial on the internet? I can play with demo code and read javadocs, but a good tutorial would save me and many others a lot of time.
While reading the Stanford NER documentation, I came across two Java examples.
NamedEntityTagAnnotation
The first uses NamedEntityTagAnnotation. This allows me to add my own properties file for training data (using regexner.mapping).
The key code is as follows.
Initialize the pipeline:
import java.util.Properties;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, regexner, depparse, natlog, openie");
props.setProperty("regexner.mapping", "mypath/mytraineddatacodes.properties");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Initialize the document:
import edu.stanford.nlp.pipeline.Annotation;
// pass4 is a String holding the text to analyze
Annotation document = new Annotation(pass4);
pipeline.annotate(document);
Then access the NER tokens and any other data needed:
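import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations.NamedEntityTagAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.CoreMap;
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
for (CoreMap sentence : sentences)
{
    for (CoreLabel token : sentence.get(TokensAnnotation.class))
    {
        String currNeToken = token.get(NamedEntityTagAnnotation.class); // NER label of this token
        String word = token.get(TextAnnotation.class);                  // surface text of this token
    }
}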
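For reference, the file passed through regexner.mapping is plain text with one tab-separated phrase/label pair per line, as in the Master's Degree and MBA lines in my question. The sketch below only illustrates that two-column layout in plain Java. MappingFileSketch and parse are hypothetical names, not part of Stanford CoreNLP, and as far as I understand the documentation the real annotator treats the first column as whitespace-separated, token-level regular expressions rather than literal strings.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Stanford CoreNLP.
class MappingFileSketch {

    // Parses lines of the form "phrase<TAB>LABEL" into an insertion-ordered map.
    // Extra columns after the label are ignored in this simplified sketch.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = line.split("\t");
            if (cols.length < 2) {
                throw new IllegalArgumentException("Expected phrase<TAB>LABEL: " + line);
            }
            mapping.put(cols[0].trim(), cols[1].trim());
        }
        return mapping;
    }

    public static void main(String[] args) {
        // The two example entries from the question, with real tab characters.
        List<String> lines = Arrays.asList("Master's Degree\tDEGREE", "MBA\tDEGREE");
        System.out.println(parse(lines));
    }
}
```

In practice the lines would come from the mapping file itself (for example via Files.readAllLines) instead of a hard-coded list.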
AbstractSequenceClassifier
This is the method demonstrated in the Stanford NERDemo.java example. It seems to provide much deeper access to the API, but I don't know how to load my customized properties file of trained data.
Initialize the classifier (which bypasses the pipeline):
import edu.stanford.nlp.ie.AbstractSequenceClassifier;
import edu.stanford.nlp.ie.crf.CRFClassifier;
String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";
AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier.getClassifierNoExceptions(serializedClassifier);
Load the file to analyze:
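import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import edu.stanford.nlp.ling.CoreAnnotations.AnswerAnnotation;
// p is the java.nio.file.Path of the input file; UTF-8 encoding is assumed here
byte[] encoded = Files.readAllBytes(p);
String fileContents = new String(encoded, StandardCharsets.UTF_8);
// One inner list per sentence, one CoreLabel per token
List<List<CoreLabel>> out = classifier.classify(fileContents);
for (List<CoreLabel> sentence : out)
{
    for (CoreLabel word : sentence)
    {
        // AnswerAnnotation holds the label the classifier assigned to this token
        Log.getLogger().debug(word.word() + '/' + word.get(AnswerAnnotation.class) + ' ');
    }
    System.out.println();
}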
And you're off to the races, except that it hasn't loaded my custom properties file for trained data.
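To make the gap concrete, here is a minimal sketch of the behaviour I am after, written as a plain-Java post-processing step over the classifier output. It is not a Stanford NER feature: LabelOverlaySketch and overlay are hypothetical names, and the tokens and labels are hand-written stand-ins for word.word() and word.get(AnswerAnnotation.class). It also assumes that each mapped phrase, split on whitespace, lines up exactly with the tokenizer's tokens, which may not hold for entries such as Master's Degree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical post-processing helper for illustration; not part of Stanford NER.
class LabelOverlaySketch {

    // Returns a copy of labels in which every token span that equals a mapped
    // phrase (split on whitespace) and is labelled "O" throughout receives the
    // mapped label. Labels already assigned by the classifier are left untouched.
    static List<String> overlay(List<String> tokens, List<String> labels,
                                Map<String, String> mapping) {
        List<String> result = new ArrayList<>(labels);
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            List<String> phrase = Arrays.asList(entry.getKey().trim().split("\\s+"));
            for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
                int end = start + phrase.size();
                boolean matches = tokens.subList(start, end).equals(phrase);
                boolean untagged = result.subList(start, end).stream().allMatch("O"::equals);
                if (matches && untagged) {
                    for (int i = start; i < end; i++) {
                        result.set(i, entry.getValue());
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written example data, not real classifier output.
        List<String> tokens = Arrays.asList("Alice", "holds", "an", "MBA", "from", "Stanford");
        List<String> labels = Arrays.asList("PERSON", "O", "O", "O", "O", "ORGANIZATION");
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("MBA", "DEGREE");
        System.out.println(overlay(tokens, labels, mapping));
    }
}
```

Matching here is exact and case-sensitive, and only spans labelled O are overwritten, so existing classifier labels always win. Whether a built-in way exists to get the same effect from AbstractSequenceClassifier is exactly what I am asking.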
Source: https://stackoverflow.com/questions/37803277/stanford-ner-abstractsequenceclassifier-vs-namedentitytagannotation