I am trying to follow an implementation of an attention decoder.
The AttentionDecoder class inherits from Recurrent, an abstract base cla