huggingface-transformers

What should properly formatted data for NER with BERT look like?

别来无恙 submitted on 2020-08-09 08:57:28
Question: I am using Huggingface's transformers library and want to perform NER using BERT. I tried to find an explicit example of how to properly format the data for NER with BERT, but it is not entirely clear to me from the paper and the comments I've found.

Let's say we have the following sentence and labels:

    sent = "John Johanson lives in Ramat Gan."
    labels = ['B-PER', 'I-PER', 'O', 'O', 'B-LOC', 'I-LOC']

Would the data that we input to the model be something like this:

    sent = ['[CLS]', 'john', 'johan', '#