问题
I'm doubting whether tensorflow is correctly configured on my gpu box, since it's about 100x slower per iteration to train a simple linear regression model (batchsize = 32, 1500 input features, 150 output variables) on my fancy gpu machine than on my laptop.
I'm using a Titan X, with a modern cpu, etc. nvidia-smi says that I'm only at 10% gpu utilization, but I expect that's because of the small batchsizes. I'm not using a feed_dict to move data into the computation graph. Everything is coming via a tf.decode_csv and tf.train.shuffle_batch.
Does anyone have any recommendations for how to easily test whether my install is correct? Are there any simple speed benchmarks? The speed difference between my laptop and the gpu machine is so dramatic that I'm expecting that things aren't configured properly.
回答1:
Try tensorflow/tensorflow/models/image/mnist/convolutional.py, that'll print per-step timing.
On Tesla K40c that should get about 16 ms per step, while about 120 ms for CPU-only on my 3 year old machine
Edit: This moved to the models repositories: https://github.com/tensorflow/models/blob/master/tutorials/image/mnist/convolutional.py.
The convolutional.py file is now at models/tutorials/image/mnist/convolutional.py
回答2:
Extending Yaroslavs answer: Here is how to do the entire testing process (CUDA and cudNN installed already)
git clone https://github.com/tensorflow/models.git
Create a Virtual Environment for tensorflow and install tensorflow
virtualenv --system-site-packages -p python3 tf-venv3
source tf-venv3/bin/activate
pip install --upgrade pip
pip install --upgrade tensorflow-gpu
Run the model within your Virtual Environment
python models/tutorials/image/mnist/convolutional.py
My GTX 1070 needs ~5ms per step
Note: On Geforce 1050 Ti it takes ~10ms per step
来源:https://stackoverflow.com/questions/35703201/speed-benchmark-for-testing-tensorflow-install