Error converting Pegasus to the ONNX format from Transformers

Submitted by ◇◆丶佛笑我妖孽 on 2021-02-10 14:21:55

Question


I am trying to convert the Pegasus newsroom model from HuggingFace's transformers library to the ONNX format. I followed the guide published by Hugging Face. After installing the prerequisites, I ran this code:

!rm -rf onnx/
from pathlib import Path
from transformers.convert_graph_to_onnx import convert

convert(framework="pt", model="google/pegasus-newsroom", output=Path("onnx/google/pegasus-newsroom.onnx"), opset=11)

and got these errors:

ValueError                                Traceback (most recent call last)
<ipython-input-9-3b37ed1ceda5> in <module>()
      3 from transformers.convert_graph_to_onnx import convert
      4 
----> 5 convert(framework="pt", model="google/pegasus-newsroom", output=Path("onnx/google/pegasus-newsroom.onnx"), opset=11)
      6 
      7 

6 frames
/usr/local/lib/python3.6/dist-packages/transformers/models/pegasus/modeling_pegasus.py in forward(self, input_ids, attention_mask, encoder_hidden_states, encoder_attention_mask, head_mask, encoder_head_mask, past_key_values, inputs_embeds, use_cache, output_attentions, output_hidden_states, return_dict)
    938             input_shape = inputs_embeds.size()[:-1]
    939         else:
--> 940             raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
    941 
    942         # past_key_values_length

ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

I have never seen this error before. Any ideas?


Answer 1:


Pegasus is a seq2seq (encoder-decoder) model, and you can't convert a seq2seq model directly with this method. The guide is written for BERT, which is an encoder-only model. Any encoder-only or decoder-only transformer model can be converted this way.

To convert a seq2seq model, you have to split it and convert the two parts separately: the encoder to ONNX and the decoder to ONNX. You can follow this guide (it was written for T5, which is also a seq2seq model).

Why are you getting this error?

When converting a PyTorch model to ONNX with a call such as

_ = torch.onnx._export(
    model,
    dummy_input,
    ...
)

you need to provide a dummy input for the encoder and for the decoder separately. By default, this conversion method builds a dummy input for the encoder only. Because it does not handle the decoder of a seq2seq model, it never gives the decoder a dummy input, and the decoder's forward pass raises the error above: ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
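
To make the mechanism concrete, here is a minimal pure-Python sketch. It does not use transformers or torch: `ToySeq2Seq` and `trace` are hypothetical stand-ins invented for illustration, with `trace` playing the role of an exporter that runs the model once on dummy inputs. It only shows why encoder-side dummy inputs alone trip the decoder's check, and why tracing the two halves separately avoids it.

```python
# Toy stand-in, NOT the transformers or torch.onnx API. All names here are
# hypothetical and exist only to illustrate the dummy-input mismatch.


class ToySeq2Seq:
    """Hypothetical encoder-decoder model with a Pegasus-like input check."""

    def encode(self, input_ids):
        # Pretend "hidden states": one float per input token.
        return [float(i) for i in input_ids]

    def decode(self, decoder_input_ids, encoder_hidden_states):
        # Same kind of guard that raised the error in the question.
        if decoder_input_ids is None:
            raise ValueError(
                "You have to specify either decoder_input_ids or decoder_inputs_embeds"
            )
        context = sum(encoder_hidden_states)
        return [tok + context for tok in decoder_input_ids]

    def forward(self, input_ids, decoder_input_ids=None):
        return self.decode(decoder_input_ids, self.encode(input_ids))


def trace(fn, dummy_inputs):
    """Stand-in for an exporter: run fn once on the given dummy inputs."""
    return fn(*dummy_inputs)


model = ToySeq2Seq()
encoder_dummy = ([1, 2, 3],)  # what an encoder-oriented converter would build

# 1) Whole model traced with encoder-side dummy input only: the decoder check fails.
try:
    trace(model.forward, encoder_dummy)
    whole_model_error = None
except ValueError as exc:
    whole_model_error = str(exc)

# 2) Split: trace each half with its own dummy inputs.
encoder_out = trace(model.encode, encoder_dummy)
decoder_dummy = ([0], encoder_out)  # a start token plus the encoder output
decoder_out = trace(model.decode, decoder_dummy)

print(whole_model_error)
print(encoder_out, decoder_out)
```

In a real conversion the same idea applies: each half is exported on its own with dummy inputs that match its forward signature. The actual export calls depend on the library versions in use and are not shown here.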



Source: https://stackoverflow.com/questions/66109084/error-converting-pegasus-to-the-onnx-format-from-transformers
