BertTokenizer not giving correct offset values for strings containing Unicode characters

忘掉有多难 2020-12-15 02:49

I am working with the SQuAD 1.1 tfds dataset for a project that uses BERT. I needed offsets for each WordPiece token, so I decided to use the BertTokenizer class from tenso
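
One possible cause of this kind of mismatch, offered only as an assumption since the question is cut off before any code or output is shown, is that the tokenizer reports offsets in UTF-8 bytes while the text is being indexed by characters. The sketch below uses plain Python and no tokenizer at all. The helper name `byte_to_char_offsets` and the sample sentence are hypothetical, chosen purely for illustration.

```python
def byte_to_char_offsets(text, byte_offsets):
    """Map UTF-8 byte offsets into `text` to character (code point) offsets.

    Assumes every offset falls on a character boundary; an offset that
    lands in the middle of a multi-byte character raises KeyError.
    """
    byte_to_char = {}
    byte_pos = 0
    for char_pos, ch in enumerate(text):
        byte_to_char[byte_pos] = char_pos
        byte_pos += len(ch.encode("utf-8"))
    byte_to_char[byte_pos] = len(text)  # offset just past the last character
    return [byte_to_char[b] for b in byte_offsets]


# Hypothetical sample: "\u00e9" (e with acute accent) takes 2 bytes in UTF-8,
# so every byte offset after it is one larger than the character offset.
text = "caf\u00e9 au lait"
word = "au"

encoded = text.encode("utf-8")
byte_start = encoded.index(word.encode("utf-8"))
byte_end = byte_start + len(word.encode("utf-8"))

char_start, char_end = byte_to_char_offsets(text, [byte_start, byte_end])

print(byte_start, byte_end)        # byte offsets into the encoded string
print(char_start, char_end)        # character offsets into the Python str
print(text[char_start:char_end])   # slice using the converted offsets
```

If the offsets you get back only line up after a conversion like this, the byte versus character difference is likely what you are seeing. If they still do not line up, the cause is probably something else, such as text normalization applied before tokenization.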
