Stanford-NER customization to classify software programming keywords


攒了一身酷 2021-01-01 08:04

I am new to NLP and have used the Stanford NER tool to classify some random text, in order to extract special keywords used in software programming.

The problem is, I don't know how

2 answers

[愿得一人] (original poster)

2021-01-01 08:43
I think this is quite well documented in the Stanford NER FAQ: http://nlp.stanford.edu/software/crf-faq.shtml#a

Here are the steps:
- In your properties file, change the map to specify how your training data is annotated (or structured):
  
  map = word=0,myfeature=1,answer=2
- In src/edu/stanford/nlp/sequences/SeqClassifierFlags.java
  
  Add a flag stating that you want to use your new feature; let's call it useMyFeature. Below public boolean useLabelSource = false; add public boolean useMyFeature = true;
  
  In the same file, in the setProperties(Properties props, boolean printProps) method, after the else if (key.equalsIgnoreCase("useTrainLexicon")) { ..} branch, add a branch that tells the tool whether this flag is switched on or off:
```
else if (key.equalsIgnoreCase("useMyFeature")) {
  // Read the on/off value given in the properties file.
  useMyFeature = Boolean.parseBoolean(val);
}
```
- In src/edu/stanford/nlp/ling/CoreAnnotations.java, add the following section:
```
// Annotation key for the custom column; its values are stored as Strings.
public static class myfeature implements CoreAnnotation<String> {
  public Class<String> getType() {
    return String.class;
  }
}
```
- In src/edu/stanford/nlp/ling/AnnotationLookup.java, at the bottom of public enum KeyLookup {..}, add
  
  MY_TAG(CoreAnnotations.myfeature.class, "myfeature")
- In src/edu/stanford/nlp/ie/NERFeatureFactory.java, depending on the "type" of feature it is, add it to the matching method, for example:
```
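// Inside the body of:
//   protected Collection<String> featuresC(PaddedList<IN> cInfo, int loc)
// where c is the current token and featuresC is the collection being built.
if (flags.useMyFeature) {
    featuresC.add(c.get(CoreAnnotations.myfeature.class) + "-my_tag");
}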
```
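To see how the pieces above fit together without rebuilding Stanford NER, here is a minimal self-contained sketch. It is a toy stand-in, not Stanford NER code: the class `MyFeatureSketch`, the helper `parseMap`, the sample training line, and the `-WORD` suffix are all assumptions made for illustration. It only mimics the flow the steps describe: the `map` line names the columns, the `useMyFeature` flag is read from properties, and the feature string ends in `-my_tag`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

/** Toy stand-in for the steps above; not part of Stanford NER. */
public class MyFeatureSketch {

    // Mirrors the flag added to SeqClassifierFlags (default true, as in the answer).
    static boolean useMyFeature = true;

    /** Reads the flag in the same way as the setProperties branch shown above. */
    static void setProperties(Properties props) {
        for (String key : props.stringPropertyNames()) {
            String val = props.getProperty(key);
            if (key.equalsIgnoreCase("useMyFeature")) {
                useMyFeature = Boolean.parseBoolean(val);
            }
        }
    }

    /** Parses a map string such as "word=0,myfeature=1,answer=2" into name -> column index. */
    static Map<String, Integer> parseMap(String map) {
        Map<String, Integer> columns = new HashMap<>();
        for (String entry : map.split(",")) {
            String[] kv = entry.trim().split("=");
            columns.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return columns;
    }

    /** Builds the feature strings for one tab-separated training line. */
    static List<String> features(String line, Map<String, Integer> columns) {
        String[] cols = line.split("\t");
        List<String> out = new ArrayList<>();
        // Illustrative word feature; the suffix is an assumption of this sketch.
        out.add(cols[columns.get("word")] + "-WORD");
        if (useMyFeature) {
            out.add(cols[columns.get("myfeature")] + "-my_tag");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> columns = parseMap("word=0,myfeature=1,answer=2");
        // Hypothetical training line: token, custom feature value, gold label.
        String line = "ArrayList\tCLASSNAME\tKEYWORD";
        System.out.println(features(line, columns));

        // Switching the flag off through properties removes the custom feature.
        Properties props = new Properties();
        props.setProperty("useMyFeature", "false");
        setProperties(props);
        System.out.println(features(line, columns));
    }
}
```

With the flag on, the custom column contributes one extra feature string per token; with `useMyFeature=false` in the properties, only the word feature remains. This is a cheap way to check that the column order in `map` matches your training file before touching the real sources.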
Debugging: In addition, there are methods that dump the features to a file; use them to see what is happening under the hood. I think you will also have to spend some time with a debugger :P
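The answer does not name the dumping methods, so the following is only a generic sketch of the idea and does not use any Stanford NER API. The class `FeatureDumpSketch` and its `dump` helper are hypothetical; they write one line per token with its feature strings so you can check whether the `-my_tag` features appear.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for inspecting features; not part of Stanford NER. */
public class FeatureDumpSketch {

    /** Writes one line per token: the token, a tab, then its features separated by spaces. */
    static void dump(List<String> tokens, List<List<String>> features, Writer out)
            throws IOException {
        for (int i = 0; i < tokens.size(); i++) {
            out.write(tokens.get(i) + "\t" + String.join(" ", features.get(i)) + "\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up tokens and feature strings, purely for illustration.
        List<String> tokens = Arrays.asList("new", "ArrayList");
        List<List<String>> features = Arrays.asList(
                Arrays.asList("new-WORD", "KEYWORD-my_tag"),
                Arrays.asList("ArrayList-WORD", "CLASSNAME-my_tag"));

        // A StringWriter keeps the demo side-effect free; pass a java.io.FileWriter
        // instead to write the dump to a file.
        StringWriter out = new StringWriter();
        dump(tokens, features, out);
        System.out.print(out);
    }
}
```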