How to create an incremental NER training model (appending to an existing model)?

三世轮回 submitted on 2019-12-21 20:43:18

Question


I am training a customized Named Entity Recognition (NER) model using Stanford NLP, but I also want to be able to re-train the model.

Example :

Suppose I trained an xyz model and then tested it on some text. If the model detects something incorrectly, I (the end user) will correct it and want to re-train (append to) the model on the corrected text.

Stanford doesn't provide a re-training facility, so I switched to the spaCy library in Python, where I can re-train the model, i.e. append new entities to the existing model. But after re-training the model with spaCy, it overrides the existing knowledge (the training data already in it) and only returns results related to the most recent training.

For example, suppose I trained a model on the TECHNOLOGY tag using 1000 records, and then added one more entity, BOOK_NAME, to the existing trained model. If I test the model after this, the spaCy model only detects BOOK_NAME in the text.
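For reference, the append step looks roughly like the following (a minimal sketch using spaCy v2's nlp.update API; the model name, example text, and offsets are placeholders rather than my real data):

```python
import random
import spacy

# Load the model that was already trained on the TECHNOLOGY entity
# (the name "my_technology_model" is just a placeholder).
nlp = spacy.load("my_technology_model")
ner = nlp.get_pipe("ner")
ner.add_label("BOOK_NAME")  # append the new entity type

# Only BOOK_NAME examples are supplied here -- no TECHNOLOGY examples.
book_examples = [
    ("I just finished reading War and Peace",
     {"entities": [(24, 37, "BOOK_NAME")]}),
]

optimizer = nlp.resume_training()
for _ in range(20):
    random.shuffle(book_examples)
    for text, annotations in book_examples:
        nlp.update([text], [annotations], sgd=optimizer, drop=0.35)

nlp.to_disk("my_technology_model_v2")
```

After this step, the model detects BOOK_NAME but has largely stopped detecting TECHNOLOGY.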

Please suggest how I can tackle this problem.

Thanks in Advance...!


Answer 1:


This may be a bit late, but the issue you are facing is known as the 'catastrophic forgetting' problem. You can get around it by mixing in examples of the entity types the model already knows. For instance, spaCy's pretrained model predicts well on well-formed text such as a news corpus (e.g. BBC articles): run the pretrained model over such a corpus, turn its predictions into training examples, mix those with your new examples, and then train. You should now get better results. This has also been discussed in the spaCy issue tracker.
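Here is a minimal sketch of this idea, assuming spaCy v2's nlp.update signature (in spaCy v3 you would wrap each (text, annotations) pair in an Example object); the corpus sentences, model names, and offsets below are illustrative only:

```python
import random
import spacy

nlp = spacy.load("en_core_web_sm")   # pretrained model whose old knowledge we want to keep
ner = nlp.get_pipe("ner")
ner.add_label("BOOK_NAME")           # the new entity type

# 1. New, hand-labelled examples for the new entity type.
new_examples = [
    ("I just finished reading War and Peace",
     {"entities": [(24, 37, "BOOK_NAME")]}),
]

# 2. "Revision" examples: run the pretrained model over well-formed,
#    unlabelled text and keep its own predictions as silver annotations,
#    so the previously learned entity types stay represented.
unlabelled_texts = [
    "Apple is looking at buying a U.K. startup for $1 billion.",
]
revision_examples = []
for doc in nlp.pipe(unlabelled_texts):
    entities = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
    revision_examples.append((doc.text, {"entities": entities}))

# 3. Mix both sets and update only the NER weights.
train_data = new_examples + revision_examples
other_pipes = [p for p in nlp.pipe_names if p != "ner"]
with nlp.disable_pipes(*other_pipes):
    optimizer = nlp.resume_training()
    for _ in range(20):
        random.shuffle(train_data)
        losses = {}
        for text, annotations in train_data:
            nlp.update([text], [annotations], sgd=optimizer,
                       drop=0.35, losses=losses)

nlp.to_disk("ner_model_updated")
```

Mixing the silver revision examples with the new gold examples keeps a loss signal for the old entity types, which is what stops the new label from overwriting them.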



Source: https://stackoverflow.com/questions/46114476/how-to-create-incremental-ner-training-modelappending-in-existing-model
