bert-language-model

Fine-tune Bert for specific domain (unsupervised)

Submitted by 自古美人都是妖i on 2021-01-20 08:39:28
Question: I want to fine-tune BERT on texts that are related to a specific domain (in my case, engineering). The training should be unsupervised since I don't have any labels. Is this possible? Answer 1: What you in fact want is to continue pre-training BERT on text from your specific domain, i.e. keep training the model as a masked language model, but on your domain-specific data. You can use the run_mlm.py script from Huggingface's Transformers. Source:
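Not part of the original answer, but as an illustration of what "continue training as a masked language model" amounts to, here is a minimal sketch of the same workflow that run_mlm.py wraps, written directly against the Trainer API. The model name, the corpus file domain_corpus.txt (one document per line), and the hyperparameters are placeholders, not anything from the answer:

from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Raw, unlabeled domain text; replace with your own corpus file.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# The collator masks 15% of the tokens on the fly, which is the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(output_dir="bert-domain-mlm",
                         num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args,
        train_dataset=tokenized["train"],
        data_collator=collator).train()

The saved checkpoint can then be used as the starting point for any supervised downstream task once labels become available.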

How to use Bert for long text classification?

Submitted by …衆ロ難τιáo~ on 2021-01-14 04:14:19
Question: We know that BERT has a max length limit of 512 tokens, so if an article is much longer than 512 tokens, say 10,000 tokens, how can BERT be used? Answer 1: You have basically three options: You cut the longer texts off and only use the first 512 tokens. The original BERT implementation (and probably the others as well) truncates longer sequences automatically; for most cases, this option is sufficient. You can split your text into multiple subtexts, classify each of them and …
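The answer is cut off above. Purely as a rough illustration of the second option (overlapping chunks whose predictions are pooled), the sketch below relies on the tokenizer's overflow mechanism; the model name, window/stride values and mean pooling are illustrative assumptions, not part of the answer:

import torch
from transformers import BertForSequenceClassification, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

def classify_long_text(text):
    # Split the document into 512-token windows that overlap by 256 tokens.
    enc = tokenizer(text,
                    max_length=512,
                    stride=256,
                    truncation=True,
                    return_overflowing_tokens=True,
                    padding="max_length",
                    return_tensors="pt")
    with torch.no_grad():
        logits = model(input_ids=enc["input_ids"],
                       attention_mask=enc["attention_mask"]).logits
    # Pool the chunk-level predictions; averaging is one simple choice.
    return logits.mean(dim=0)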

Sliding window for long text in BERT for Question Answering

Submitted by 岁酱吖の on 2021-01-05 00:51:51
Question: I've read a post which explains how the sliding window works, but I cannot find any information on how it is actually implemented. From what I understand, if the input is too long, a sliding window can be used to process the text. Please correct me if I am wrong. Say I have the text "In June 2017 Kaggle announced that it passed 1 million registered users". Given some stride and max_len, the input can be split into chunks with overlapping words (not considering padding): In June 2017 Kaggle …
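The question is cut off above. Purely to illustrate the chunking step being described, the sketch below lets a fast tokenizer emit the overlapping windows; the max_length and stride values are small illustrative numbers, not anything from the question:

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
text = "In June 2017 Kaggle announced that it passed 1 million registered users"

enc = tokenizer(text,
                max_length=10,   # tiny window so the overlap is easy to see
                stride=4,        # number of tokens shared by consecutive windows
                truncation=True,
                return_overflowing_tokens=True)
for ids in enc["input_ids"]:
    print(tokenizer.decode(ids))
# Consecutive windows repeat the last `stride` tokens of the previous one, so a
# span cut off at one window boundary is still fully contained in the next window.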

CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

Submitted by 爷,独闯天下 on 2020-12-30 06:12:46
Question: I got the following error when I ran my PyTorch deep learning model in Colab:

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in linear(input, weight, bias)
   1370         ret = torch.addmm(bias, input, weight.t())
   1371     else:
-> 1372         output = input.matmul(weight.t())
   1373     if bias is not None:
   1374         output += bias
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

I even reduced the batch size from 128 to 64, i.e. to half, but I still got this error …
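No answer is included in this excerpt. As a general debugging note (an assumption on my part, not from the question), this particular cuBLAS failure is often a masked secondary error, such as a label or token index that is out of range for the final linear layer, rather than a plain out-of-memory condition. A hedged sketch of two common ways to surface the underlying error:

import os

# Must be set before any CUDA work happens, so the failing kernel is reported
# at its call site instead of as a later cuBLAS allocation failure.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Alternatively, run a single batch on the CPU: indexing problems then raise a
# readable IndexError instead of an opaque CUDA error. (model, input_ids and
# attention_mask are stand-ins for the question's own objects.)
# model_cpu = model.to("cpu")
# out = model_cpu(input_ids.cpu(), attention_mask=attention_mask.cpu())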

BERT-based NER model giving inconsistent prediction when deserialized

Submitted by 倖福魔咒の on 2020-12-13 04:02:17
Question: I am trying to train an NER model using the HuggingFace transformers library on Colab cloud GPUs, pickle it, and load the model on my own CPU to make predictions.

Code

The model is the following:

from transformers import BertForTokenClassification

model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=NUM_LABELS,
    output_attentions=False,
    output_hidden_states=False
)

I am using this snippet to save the model on Colab:

import torch
torch.save(model.state_dict(), …
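The question is cut off mid-snippet. For context, here is a minimal sketch of the matching load step on a CPU-only machine (the file name and label count are illustrative). Loading without map_location, or forgetting model.eval() so that dropout stays active, are both common reasons a deserialized model predicts differently from the one trained on Colab:

import torch
from transformers import BertForTokenClassification

NUM_LABELS = 9  # must be the same value that was used at training time

model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=NUM_LABELS,
    output_attentions=False,
    output_hidden_states=False,
)
# map_location remaps tensors saved on the Colab GPU onto the local CPU.
state_dict = torch.load("ner_model.pt", map_location=torch.device("cpu"))
model.load_state_dict(state_dict)
model.eval()  # disable dropout so repeated predictions are deterministic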

AttributeError: 'str' object has no attribute 'dim' in pytorch

Submitted by 流过昼夜 on 2020-12-12 02:06:59
Question: I got the following error output in PyTorch when I sent inputs into the model for prediction. Does anyone know what's going on? Below is the model architecture that I created; the error output shows the issue is in the x = self.fc1(cls_hs) line.

class BERT_Arch(nn.Module):
    def __init__(self, bert):
        super(BERT_Arch, self).__init__()
        self.bert = bert
        # dropout layer
        self.dropout = nn.Dropout(0.1)
        # relu activation function
        self.relu = nn.ReLU()
        # dense layer 1
        self.fc1 = nn.Linear…
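The class definition is cut off above. One frequent cause of this exact error (an assumption, since the rest of the class is not shown) is that in transformers v4+ self.bert(...) returns a ModelOutput, and tuple-unpacking it yields its string keys, so cls_hs ends up being the string 'pooler_output' by the time it reaches self.fc1. A hedged sketch of a forward method that avoids this, using the attribute names from the snippet plus assumed input names sent_id and mask:

def forward(self, sent_id, mask):
    # Select the named field instead of tuple-unpacking the ModelOutput.
    outputs = self.bert(sent_id, attention_mask=mask)
    cls_hs = outputs.pooler_output   # a tensor, not a string key
    x = self.fc1(cls_hs)
    x = self.relu(x)
    x = self.dropout(x)
    return x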