NER is overwriting the custom NER in Stanford NLP

坚强是说给别人听的谎言, submitted on 2019-12-08 13:02:45

Question


In Stanford NLP, I used a regexner pattern to match phone numbers, but the NER annotator overwrites the result with NUMBER.

If I remove the ner annotator, the token is tagged as PHONE_NUMBER. Can anyone help me with this?

Thanks in advance.

Here is my regexner line:

^(?:(?:\+|0{0,2})91(\s*[\-]\s*)?|[0]?)?[789]\d{9}$  PHONENUMBER

Answer 1:


java command:

java -Xmx10g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner -file phone-number-example.txt -outputFormat text -ner.fine.regexner.mapping phone-number-regex.rules

example text:

I will call him at 555-555-5555

format of rules file:

555-555-5555    PHONE_NUMBER    NUMBER  1

(Note that the columns are tab-delimited.)

The fine-grained NER rules are applied after the statistical NER model. You can also build a custom regexner annotator and run it after the statistical model. The key is telling the rule to overwrite the NUMBER tag, which is what the third column specifies.
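To make the role of the third column concrete, here is a minimal sketch in plain Java. It is a simplified model of the behavior described above, not CoreNLP's implementation: the class OverwriteSketch, the Rule type, and the apply method are hypothetical names, and it assumes a rule may replace a tag only when the token is untagged ("O") or its current tag is listed in the third column. The real annotator does more (multi-token patterns, priorities).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OverwriteSketch {
    // One parsed line of a tab-delimited rules file (illustrative type, not a CoreNLP class).
    static final class Rule {
        final String pattern;
        final String label;
        final Set<String> overwritable;

        Rule(String line) {
            String[] cols = line.split("\t");
            pattern = cols[0];
            label = cols[1];
            // Third column (optional): comma-separated tags this rule may replace.
            overwritable = cols.length > 2
                ? new HashSet<>(Arrays.asList(cols[2].split(",")))
                : new HashSet<>();
        }
    }

    // Returns the tag the token ends up with after the rule is considered.
    // Assumption: only "O" or a tag listed in the third column may be replaced.
    static String apply(Rule rule, String token, String currentTag) {
        boolean matches = token.matches(rule.pattern);
        boolean mayReplace = currentTag.equals("O") || rule.overwritable.contains(currentTag);
        return (matches && mayReplace) ? rule.label : currentTag;
    }

    public static void main(String[] args) {
        Rule withColumn = new Rule("555-555-5555\tPHONE_NUMBER\tNUMBER\t1");
        Rule withoutColumn = new Rule("555-555-5555\tPHONE_NUMBER");
        // The statistical model has already tagged the token as NUMBER.
        System.out.println(apply(withColumn, "555-555-5555", "NUMBER"));
        System.out.println(apply(withoutColumn, "555-555-5555", "NUMBER"));
    }
}
```

In this model the rule with NUMBER in its third column relabels the token, while the two-column rule leaves the existing NUMBER tag in place, which mirrors the symptom in the question.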




Answer 2:


^(?:(?:\+|0{0,2})91(\s*[\-]\s*)?|[0]?)?[789]\d{9}$  PHONENUMBER NUMBER

This worked: the column after the custom NER label names the tag to overwrite.
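Before relying on the rule, it can help to check what the pattern itself matches. The sketch below uses only java.util.regex, not CoreNLP; the class name PhonePatternCheck, the isPhone helper, and the sample strings are made up for illustration, and it assumes the pattern is tested against one whole token at a time.

```java
import java.util.regex.Pattern;

public class PhonePatternCheck {
    // The pattern from the rule line above, escaped for a Java string literal.
    static final Pattern PHONE = Pattern.compile(
        "^(?:(?:\\+|0{0,2})91(\\s*[\\-]\\s*)?|[0]?)?[789]\\d{9}$");

    // True when the whole string matches the phone-number pattern.
    static boolean isPhone(String token) {
        return PHONE.matcher(token).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"9876543210", "+919876543210", "09876543210", "555-555-5555"};
        for (String s : samples) {
            System.out.println(s + " -> " + isPhone(s));
        }
    }
}
```

Note that this pattern targets ten-digit numbers starting with 7, 8, or 9, with an optional 0 or 91 prefix, so a hyphenated string such as 555-555-5555 does not match it and would need its own rule.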



Source: https://stackoverflow.com/questions/50678418/ner-is-over-writing-the-custom-nerin-stanford-nlp
