I want to implement my own optimizers for TensorFlow models. To begin with, I am trying to understand how the existing optimizers are implemented, starting with plain SGD. However