How to use BERT for long text classification?

野的像风 2020-12-14 18:36

We know that BERT has a maximum input length of 512 tokens. If an article is much longer than that, say 10,000 tokens, how can BERT be used?

6 answers
  •  北海茫月
    2020-12-14 19:09

    There are two main methods:

    • Splitting the text into chunks, running a 'short' BERT on each chunk (at most 512 tokens per chunk), and combining the results
    • Constructing a real long BERT (CogLTX, Blockwise BERT, Longformer, Big Bird)
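
To make the first method concrete, here is a minimal sketch in plain Python. It assumes the text is already tokenized into ids, that 101 and 102 are the [CLS] and [SEP] ids (true for the `bert-base-uncased` vocabulary, but check your own tokenizer), and that per-chunk class scores are simply averaged. The names `chunk_token_ids`, `classify_long`, and `score_chunk` are hypothetical helpers for illustration, not part of any library, and averaging is only one of several ways to combine chunk outputs.

```python
from typing import Callable, List, Sequence


def chunk_token_ids(
    token_ids: Sequence[int],
    max_len: int = 512,   # BERT's input limit, special tokens included
    stride: int = 256,    # step between window starts; < max_len - 2 gives overlap
    cls_id: int = 101,    # assumed [CLS] id (bert-base-uncased)
    sep_id: int = 102,    # assumed [SEP] id (bert-base-uncased)
) -> List[List[int]]:
    """Split a long id sequence into windows that each fit into BERT."""
    body = max_len - 2  # room left after adding [CLS] and [SEP]
    if body <= 0:
        raise ValueError("max_len must be at least 3")
    if not 0 < stride <= body:
        raise ValueError("stride must be in 1..max_len-2 so no token is skipped")

    chunks: List[List[int]] = []
    start = 0
    while True:
        window = list(token_ids[start:start + body])
        chunks.append([cls_id] + window + [sep_id])
        if start + body >= len(token_ids):  # this window reached the end
            break
        start += stride
    return chunks


def classify_long(
    token_ids: Sequence[int],
    score_chunk: Callable[[List[int]], List[float]],
    **chunk_kwargs,
) -> List[float]:
    """Score every chunk with `score_chunk` and average the class scores."""
    scores = [score_chunk(c) for c in chunk_token_ids(token_ids, **chunk_kwargs)]
    n_classes = len(scores[0])
    return [sum(s[k] for s in scores) / len(scores) for k in range(n_classes)]
```

In practice `score_chunk` would wrap a fine-tuned BERT classifier and return its logits or probabilities for one chunk. Max-pooling the scores, or feeding the per-chunk [CLS] vectors to a small second model, are common alternatives to averaging.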

    I summarized some representative papers on BERT for long text in this post: https://lethienhoablog.wordpress.com/2020/11/19/paper-dissected-and-recap-4-which-bert-for-long-text/

    You can have an overview of all methods there.
