Stanford-NER customization to classify software programming keywords

攒了一身酷 2021-01-01 08:04

I am new to NLP. I used the Stanford NER tool to classify some random text in order to extract special keywords used in software programming.

The problem is, I don't know how

2 answers
  •  [愿得一人]
    2021-01-01 08:43

    I think this is quite well documented in the Stanford NER FAQ: http://nlp.stanford.edu/software/crf-faq.shtml#a.

    Here are the steps:

    • In your properties file, change the map to specify how your training data is annotated (that is, which column holds what):

    map = word=0,myfeature=1,answer=2

    • In src/edu/stanford/nlp/sequences/SeqClassifierFlags.java:

      Add a flag stating that you want to use your new feature; let's call it useMyFeature. Below public boolean useLabelSource = false; add public boolean useMyFeature = true;

      In the same file, in the setProperties(Properties props, boolean printProps) method, after else if (key.equalsIgnoreCase("useTrainLexicon")) { ..}, tell the tool whether this flag is on or off:

      else if (key.equalsIgnoreCase("useMyFeature")) {
            useMyFeature = Boolean.parseBoolean(val);
      }
      
    • In src/edu/stanford/nlp/ling/CoreAnnotations.java, add the following section:

      public static class myfeature implements CoreAnnotation<String> {
        public Class<String> getType() {
          return String.class;
        }
      }
      
    • In src/edu/stanford/nlp/ling/AnnotationLookup.java, at the bottom of public enum KeyLookup {..}, add:

      MY_TAG(CoreAnnotations.myfeature.class,"myfeature")

    • In src/edu/stanford/nlp/ie/NERFeatureFactory.java, depending on the "type" of feature it is, add it in the matching method, guarded by the flag defined earlier. For example, in

      protected Collection<String> featuresC(PaddedList<IN> cInfo, int loc)

      if (flags.useMyFeature) {
          featuresC.add(c.get(CoreAnnotations.myfeature.class) + "-my_tag");
      }
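
To see how the steps above fit together, here is a minimal stand-alone sketch. It does not use Stanford NER at all: the class MyFeatureSketch, its methods, the "-WORD" suffix and the sample training line are all made up for illustration. It only mimics the idea that map names the columns of a tab-separated training line, that useMyFeature switches the extra feature on or off, and that the feature ends up as a string like value-my_tag.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for illustration; none of these names are Stanford NER classes.
class MyFeatureSketch {

    // Parse a map spec such as "word=0,myfeature=1,answer=2" into column name -> index.
    static Map<String, Integer> parseMap(String spec) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        for (String entry : spec.split(",")) {
            String[] kv = entry.trim().split("=");
            columns.put(kv[0].trim(), Integer.parseInt(kv[1].trim()));
        }
        return columns;
    }

    // Turn one tab-separated training line into column name -> cell value.
    static Map<String, String> readToken(String line, Map<String, Integer> columns) {
        String[] cells = line.split("\t");
        Map<String, String> token = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : columns.entrySet()) {
            token.put(e.getKey(), cells[e.getValue()]);
        }
        return token;
    }

    // Emit the extra feature only when useMyFeature is on (default true, as in the answer).
    static List<String> features(Map<String, String> token, Properties props) {
        boolean useMyFeature =
                Boolean.parseBoolean(props.getProperty("useMyFeature", "true"));
        List<String> out = new ArrayList<>();
        out.add(token.get("word") + "-WORD");
        if (useMyFeature) {
            out.add(token.get("myfeature") + "-my_tag");
        }
        return out;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("map", "word=0,myfeature=1,answer=2");
        props.setProperty("useMyFeature", "true");

        Map<String, Integer> columns = parseMap(props.getProperty("map"));
        // Made-up training line: token, custom feature value, gold label.
        Map<String, String> token = readToken("HashMap\tCamelCase\tKEYWORD", columns);
        System.out.println(features(token, props)); // prints [HashMap-WORD, CamelCase-my_tag]
    }
}
```

Setting useMyFeature to false in the properties leaves only the word feature, which is a quick way to compare a model trained with and without the new column.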
      

    Debugging: In addition, there are methods that dump the features to a file; use them to see how things work under the hood. You will probably have to spend some time in the debugger too :P
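
As a rough picture of what checking such a dump can look like, the sketch below writes one line per token (the word, a tab, then its features) to a temporary file and counts how many tokens carry a feature ending in -my_tag. This is not Stanford NER's own dump facility: FeatureDumpSketch, the line format and the sample features are assumptions made only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Stanford NER.
class FeatureDumpSketch {

    // One dump line per token: word, a tab, then the features separated by spaces.
    static List<String> formatDump(List<String> words, List<List<String>> features) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            lines.add(words.get(i) + "\t" + String.join(" ", features.get(i)));
        }
        return lines;
    }

    // Count the tokens that have at least one feature ending with the given suffix.
    static long countTokensWithSuffix(List<String> lines, String suffix) {
        long count = 0;
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            if (parts.length < 2) {
                continue; // no feature column on this line
            }
            for (String feature : parts[1].split(" ")) {
                if (feature.endsWith(suffix)) {
                    count++;
                    break;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up tokens and features, standing in for real extractor output.
        List<String> words = Arrays.asList("HashMap", "is");
        List<List<String>> features = Arrays.asList(
                Arrays.asList("HashMap-WORD", "CamelCase-my_tag"),
                Arrays.asList("is-WORD"));

        Path dump = Files.createTempFile("features", ".txt");
        Files.write(dump, formatDump(words, features));

        List<String> readBack = Files.readAllLines(dump);
        System.out.println(countTokensWithSuffix(readBack, "-my_tag")
                + " of " + readBack.size() + " tokens carry a -my_tag feature");
    }
}
```

If the count is zero on a real dump, the flag is probably off or the new column is not being read through map.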
