Native TF vs Keras TF performance comparison
I created the exact same network in native TensorFlow and in Keras with the TensorFlow backend. After many hours of testing with a number of different parameters, I still can't figure out why Keras outperforms native TensorFlow and produces better (only slightly, but consistently better) results.

Does Keras use a different weight initializer? Or does it apply a different decay approach than tf.train.inverse_time_decay?

P.S. The score difference is always roughly:

Keras with TensorFlow: ~0.9850 - 0.9885, ~45 sec. average training time for 1 epoch
Native TensorFlow: ~0.9780 - 0.9830, ~23 sec. average training time for 1 epoch

My environment is: Python 3.5
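
For reference, this is the schedule I understand tf.train.inverse_time_decay to compute according to its documentation, sketched in plain Python so it can be compared against whatever Keras does. The function below is only my own illustration of the documented formula, and the argument values in the usage lines are hypothetical, not my actual training settings.

```python
import math


def inverse_time_decay(learning_rate, global_step, decay_steps, decay_rate,
                       staircase=False):
    """Plain-Python sketch of the documented inverse time decay formula.

    decayed = learning_rate / (1 + decay_rate * global_step / decay_steps)
    With staircase=True the ratio global_step / decay_steps is floored,
    so the rate drops in discrete steps instead of continuously.
    """
    ratio = global_step / decay_steps
    if staircase:
        ratio = math.floor(ratio)
    return learning_rate / (1 + decay_rate * ratio)


# Hypothetical values, for illustration only.
for step in (0, 100, 150, 1000):
    smooth = inverse_time_decay(0.1, step, decay_steps=100, decay_rate=0.5)
    stepped = inverse_time_decay(0.1, step, decay_steps=100, decay_rate=0.5,
                                 staircase=True)
    print(step, smooth, stepped)
```

If the two frameworks apply the decay at different granularities (for example per batch versus per epoch), the effective learning rates would diverge even with identical nominal settings, which is one of the things I am trying to rule out.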