So I have a very simple NN script written in TensorFlow, and I am having a hard time tracking down where some "randomness" is coming from.
I have record
The TensorFlow reduce_sum op is known to be non-deterministic. It is also used to calculate bias gradients, so the non-determinism can show up in training even if your script never calls reduce_sum directly.
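A likely reason a sum can vary between runs is that floating-point addition is not associative, so a parallel reduction that adds the same numbers in a different order can give a slightly different result. This is a minimal sketch in plain Python rather than TensorFlow, and the values are made up for illustration:

```python
# Floating-point addition is not associative: grouping the same
# three numbers differently changes the last bits of the result.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one possible reduction order
right_to_left = a + (b + c)   # another possible reduction order

print(left_to_right)
print(right_to_left)
print(left_to_right == right_to_left)  # False
```

The difference is tiny, but it can compound over many training steps.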
This post discusses a workaround that avoids reduce_sum: taking the dot product of any vector with a vector of all 1's gives the same result as reduce_sum.
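The identity behind the workaround can be checked with a small sketch. This is plain Python, not the code from the linked post; the helper names reduce_sum and dot are hypothetical stand-ins. In TensorFlow the equivalent would presumably be a matmul against a tensor of ones, but that mapping is an assumption here.

```python
def reduce_sum(values):
    """Sum the elements one by one (stand-in for a reduce_sum op)."""
    total = 0.0
    for v in values:
        total += v
    return total


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


# Illustrative values chosen to be exactly representable in binary.
values = [1.5, -2.0, 4.25, 0.25]
ones = [1.0] * len(values)

# Multiplying by 1.0 leaves each element unchanged, so the dot
# product with a ones vector is just the sum of the elements.
print(reduce_sum(values))  # 4.0
print(dot(values, ones))   # 4.0
```

Whether the replacement is actually deterministic depends on how the matrix multiply is implemented on your hardware, so it is worth verifying on your own setup.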