Why use: tanh(a + b) VS tanh(a) + tanh(b) and why use a single layered neural network layer

前端 未结 0 1005
情歌与酒
情歌与酒 2020-12-03 08:51

In this machine translation exercise:

First question:

why does this class BahdanauAttention (see class at end) do th

相关标签:
回答
  • 消灭零回复
提交回复
热议问题