Here's an example to clarify what I mean:
First session.run(): [image: first run of a TensorFlow session]
Later session.run(): [image: later runs of a TensorFlow session]
The tf.nn.conv2d() op takes much longer to run on the first tf.Session.run() invocation because, by default, TensorFlow uses cuDNN's autotune facility to choose the fastest way to run subsequent convolutions. You can see the autotune invocation here.
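If you want to confirm that the extra time is a one-off cost of the first call, you can time several consecutive calls and compare them. This is only a rough sketch: `time_calls` is a hypothetical helper, not part of TensorFlow, and it accepts any zero-argument callable.

```python
import time


def time_calls(fn, repeats=3):
    """Call fn `repeats` times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations
```

Assuming you already have a session `sess` and a convolution op `conv_op` (both placeholder names), `time_calls(lambda: sess.run(conv_op))` should show a first entry noticeably larger than the rest if autotune is the cause.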
There is an undocumented environment variable that you can use to disable autotune: set TF_CUDNN_USE_AUTOTUNE=0 when you start the process running TensorFlow (e.g. the Python interpreter).
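One way to make sure the variable is in place before TensorFlow starts is to launch the interpreter from a small wrapper that sets it in the child's environment. The sketch below uses only the standard library. The helper names `autotune_disabled_env` and `run_python_without_autotune` are made up for illustration, and `capture_output` assumes Python 3.7 or newer.

```python
import os
import subprocess
import sys


def autotune_disabled_env(base_env=None):
    """Return a copy of the environment with TF_CUDNN_USE_AUTOTUNE set to "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["TF_CUDNN_USE_AUTOTUNE"] = "0"
    return env


def run_python_without_autotune(args):
    """Start a fresh Python interpreter with the variable already set."""
    return subprocess.run(
        [sys.executable, *args],
        env=autotune_disabled_env(),
        capture_output=True,
        text=True,
        check=True,
    )
```

For example, `run_python_without_autotune(["train.py"])` would run a (hypothetical) training script with autotune disabled. From a POSIX shell, the equivalent is to prefix the command: `TF_CUDNN_USE_AUTOTUNE=0 python train.py`.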