Talking about the trend of dispensing either the Encoder or the Decoder part of the traditional Transformer (BERT and GPT3 respectively), I have trouble understanding