Stanford CRFClassifier performance evaluation output

Submitted by 做~自己de王妃 on 2019-12-11 18:33:43

Question


I'm following this FAQ https://nlp.stanford.edu/software/crf-faq.shtml to train my own classifier, and I noticed that the performance evaluation output does not match the results (or at least not in the way I expect). Specifically, this section:

CRFClassifier tagged 16119 words in 1 documents at 13824.19 words per second.

Entity   P       R       F1      TP   FP  FN
MYLABEL  1.0000  0.9961  0.9980  255  0   1
Totals   1.0000  0.9961  0.9980  255  0   1

I expect TP to be all instances where the predicted label matched the gold label, FP to be all instances where MYLABEL was predicted but the gold label was O, and FN to be all instances where O was predicted but the gold label was MYLABEL.
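The token-level counting described above can be sketched as follows (a minimal illustration of the asker's expectation, not Stanford's evaluation code; the label name MYLABEL and the parallel-list input format are assumptions):

```python
# Count token-level TP/FP/FN from parallel lists of gold and predicted
# labels, where "O" marks tokens outside any entity.
def token_counts(gold, pred, label="MYLABEL"):
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g == "O" and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p == "O")
    return tp, fp, fn

gold = ["MYLABEL", "MYLABEL", "O", "O"]
pred = ["MYLABEL", "O", "MYLABEL", "O"]
print(token_counts(gold, pred))  # (1, 1, 1)
```

As the accepted answer explains, this per-token tally is not what the classifier reports, which is why the numbers diverge.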

If I calculate those numbers myself from the program's output, I get completely different numbers, with no apparent relation to what the program prints. I've tried this with various test files. I'm using Stanford NER v3.7.0 (2016-10-31).

Am I missing something?


Answer 1:


The P/R/F1 scores are computed over entities, not over individual token labels.

Example:

(Joe, PERSON) (Smith, PERSON) (went, O) (to, O) (Hawaii, LOCATION) (., O)

In this example there are two entities:

Joe Smith   PERSON
Hawaii      LOCATION

Entities are created by taking all adjacent tokens with the same label (unless you use a more complicated BIO labeling scheme; BIO schemes have tags like B-PERSON and I-PERSON to indicate whether a token begins or continues an entity). A predicted entity counts as a true positive only if its span and label match a gold entity exactly, so one wrong token can turn a single entity into both a false positive and a false negative.
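A minimal sketch of this entity-level scoring (an illustrative assumption about the procedure, not Stanford's actual evaluator): merge adjacent tokens sharing a non-O label into spans, then compare spans exactly.

```python
def extract_entities(tokens):
    """tokens: list of (word, label) pairs; returns a set of
    (start, end, label) spans, end exclusive."""
    entities, start = set(), None
    # Append an O sentinel so a trailing entity is closed on the last pass.
    for i, (_, label) in enumerate(tokens + [("", "O")]):
        if start is not None and label != tokens[start][1]:
            entities.add((start, i, tokens[start][1]))
            start = None
        if label != "O" and start is None:
            start = i
    return entities

gold = [("Joe", "PERSON"), ("Smith", "PERSON"), ("went", "O"),
        ("to", "O"), ("Hawaii", "LOCATION"), (".", "O")]
# Suppose the classifier mislabels "Smith" as O.
pred = [("Joe", "PERSON"), ("Smith", "O"), ("went", "O"),
        ("to", "O"), ("Hawaii", "LOCATION"), (".", "O")]

g, p = extract_entities(gold), extract_entities(pred)
tp = len(g & p)   # spans predicted exactly right
fp = len(p - g)   # predicted spans with no exact gold match
fn = len(g - p)   # gold spans missed or only partially matched
print(tp, fp, fn)  # 1 1 1
```

Note how one wrong token yields both a false positive (the partial span "Joe") and a false negative (the full span "Joe Smith"), which is why entity-level counts differ from token-level ones.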



Source: https://stackoverflow.com/questions/46940195/stanford-crfclassifier-performance-evaluation-output
