Evaluation in a Spacy NER model

前端未结

关注

 1  747

甜味超标 2020-12-24 09:09

I am trying to evaluate a trained NER Model created using spacy lib. Normally for these kind of problems you can use f1 score (a ratio between precision and recall). I coul

1条回答

伪装坚强ぢ (楼主)

2020-12-24 09:24

You can find different metrics including F-score, recall and precision in spaCy/scorer.py.

This example shows how you can use it:

import spacy
from spacy.gold import GoldParse
from spacy.scorer import Scorer

def evaluate(ner_model, examples):
    scorer = Scorer()
    for input_, annot in examples:
        doc_gold_text = ner_model.make_doc(input_)
        gold = GoldParse(doc_gold_text, entities=annot)
        pred_value = ner_model(input_)
        scorer.score(pred_value, gold)
    return scorer.scores

# example run

examples = [
    ('Who is Shaka Khan?',
     [(7, 17, 'PERSON')]),
    ('I like London and Berlin.',
     [(7, 13, 'LOC'), (18, 24, 'LOC')])
]

ner_model = spacy.load(ner_model_path) # for spaCy's pretrained use 'en_core_web_sm'
results = evaluate(ner_model, examples)

The scorer.scores returns multiple scores. When running the example, the result looks like this: (Note the low scores occuring because the examples classify London and Berlin as 'LOC' while the model classifies them as 'GPE'. You can figure this out by looking at the ents_per_type.)

{'uas': 0.0, 'las': 0.0, 'las_per_type': {'attr': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'root': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'compound': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'nsubj': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'dobj': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'cc': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'conj': {'p': 0.0, 'r': 0.0, 'f': 0.0}}, 'ents_p': 33.33333333333333, 'ents_r': 33.33333333333333, 'ents_f': 33.33333333333333, 'ents_per_type': {'PERSON': {'p': 100.0, 'r': 100.0, 'f': 100.0}, 'LOC': {'p': 0.0, 'r': 0.0, 'f': 0.0}, 'GPE': {'p': 0.0, 'r': 0.0, 'f': 0.0}}, 'tags_acc': 0.0, 'token_acc': 100.0, 'textcat_score': 0.0, 'textcats_per_cat': {}}

The example is taken from a spaCy example on github (link does not work anymore). It was last tested with spacy 2.2.4.

0 讨论(0)