I am trying to compute the derivative of the softmax activation function. I found this: https://math.stackexchange.com/questions/945871/derivative-of-softmax-loss-function
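A quick numerical check can help when verifying a softmax derivative by hand. The sketch below is not taken from the linked answer. It assumes the standard result for the softmax Jacobian, ds_i/dz_j = s_i * (delta_ij - s_j), and compares it against central finite differences. The function names and the example logits are made up for illustration.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)  # subtract the max so exp() cannot overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_jacobian(z):
    """Analytic Jacobian, assuming J[i][j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    n = len(s)
    return [
        [s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
        for i in range(n)
    ]


def numerical_jacobian(z, eps=1e-6):
    """Central finite-difference estimate of the same Jacobian."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        s_plus, s_minus = softmax(z_plus), softmax(z_minus)
        for i in range(n):
            jac[i][j] = (s_plus[i] - s_minus[i]) / (2 * eps)
    return jac


z = [1.0, 2.0, 0.5]  # arbitrary example logits (hypothetical values)
analytic = softmax_jacobian(z)
numeric = numerical_jacobian(z)

# Largest entry-wise disagreement between the two Jacobians.
max_diff = max(
    abs(analytic[i][j] - numeric[i][j])
    for i in range(len(z))
    for j in range(len(z))
)
print("max abs difference:", max_diff)
```

If the analytic formula is right, the printed difference should be tiny, limited only by floating-point and finite-difference error. Each column of the Jacobian should also sum to roughly zero, since the softmax outputs always sum to 1.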
For what it's worth, here is my derivation based on SirGuy's answer (feel free to point out any errors you find):