I am trying to implement a simple Sequence-to-Sequence model with Luong global attention. The problem is that I am looking at a specific implementation of the model, and I c