Neural Machine Translation model predictions are off-by-one

The core issue when an NMT model is used to predict language-like output with a highly repetitive structure is that it becomes incentivized to simply repeat its previous prediction. Because TrainingHelper feeds the decoder the correct previous token at every step (teacher forcing) to speed up training, this creates an artificial local minimum that the model is unable to escape.
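For context, this is roughly what that teacher-forcing decoder looks like with `tf.contrib.seq2seq` in TensorFlow 1.x. It is a minimal sketch, not the exact code from the question: the names `decoder_cell`, `decoder_emb_inp`, `target_lengths`, `encoder_final_state`, and `projection_layer` are placeholders for whatever your graph already defines.

```python
import tensorflow as tf
from tensorflow.contrib import seq2seq


def build_training_decoder(decoder_cell, decoder_emb_inp, target_lengths,
                           encoder_final_state, projection_layer):
    # TrainingHelper implements teacher forcing: at step t the decoder is
    # fed the ground-truth token from step t-1, not its own prediction.
    helper = seq2seq.TrainingHelper(
        inputs=decoder_emb_inp,          # [batch, time, emb_dim]
        sequence_length=target_lengths)  # [batch]

    decoder = seq2seq.BasicDecoder(
        cell=decoder_cell,
        helper=helper,
        initial_state=encoder_final_state,
        output_layer=projection_layer)

    outputs, _, _ = seq2seq.dynamic_decode(decoder)
    return outputs.rnn_output            # logits, [batch, time, vocab]
```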

The best option I have found is to weight the loss function so that the key points in the output sequence, where the output is not repetitive, are weighted more heavily. This incentivizes the model to get those positions right rather than simply repeating the previous prediction.
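One way to express that weighting, as a minimal sketch under the same TF 1.x assumptions: up-weight every timestep whose target token differs from the previous target token. The `boost` factor and this particular weighting rule are illustrative choices, not anything prescribed by the API.

```python
import tensorflow as tf
from tensorflow.contrib import seq2seq


def weighted_repetition_loss(logits, targets, target_lengths, boost=4.0):
    """Sequence loss that up-weights non-repetitive target positions."""
    # Mask out padding beyond each sequence's true length.
    max_time = tf.shape(targets)[1]
    mask = tf.sequence_mask(target_lengths, max_time, dtype=tf.float32)

    # A position "breaks the repetition" when its target differs from the
    # target one step earlier; give those positions extra weight.
    prev_targets = tf.concat([targets[:, :1], targets[:, :-1]], axis=1)
    changed = tf.cast(tf.not_equal(targets, prev_targets), tf.float32)
    weights = mask * (1.0 + boost * changed)

    # Standard cross-entropy sequence loss, averaged with these weights.
    return seq2seq.sequence_loss(logits=logits, targets=targets,
                                 weights=weights)
```

With weights like these, copying the previous token still scores well on the (many) repeated positions, but the loss is dominated by the few positions where the output actually changes, so the model can no longer sit in that local minimum.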
