Suppose I want to use a BERT model, which takes in account the context of the words depending on the sentence, to perform a NER task. Suppose I have the following sentence i