I've installed the latest NVIDIA drivers (375.26) manually, and installed CUDA using cuda_8.0.44_linux.run (skipping the driver install there, since the bundled drivers are older).
I have followed the instructions on this page, and it works for me.
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=debnetwork
First, download the installer for Linux Ubuntu 16.04 x86_64.
Next, follow these steps to install CUDA on Ubuntu:
sudo dpkg -i cuda-repo-ubuntu1604_9.2.148-1_amd64.deb
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
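After those steps, a quick sanity check along these lines confirms that the driver and the toolkit are both visible (a minimal sketch; the PATH/LD_LIBRARY_PATH exports assume the default /usr/local/cuda install location and are not part of the original instructions):
# make the toolkit binaries and libraries visible (default install path assumed)
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
# verify the driver is loaded and the compiler is found
nvidia-smi
nvcc --version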
First, check "CUDA Toolkit and Compatible Driver Versions" from here, and make sure that your cuda toolkit version is compatible with your cuda-driver version, e.g. if your driver version is nvidia-390
, your cuda version must lower than CUDA 9.1
.
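As a convenience, the installed driver version can also be read directly from the command line before consulting the table; this query is just an extra check, not part of the original steps:
~$ nvidia-smi --query-gpu=driver_version --format=csv,noheader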
Then, back to this issue: it is caused by your driver version not matching your CUDA version, and your local CUDA version may also differ from the CUDA runtime version (the CUDA version inside a particular virtual environment).
I met the same issue when I tried to run tensorflow-gpu in the "tensorflow_gpuenv" environment created by conda and tested whether the "gpu:0" device worked. My driver version is nvidia-390 and I had already installed CUDA 9.0, so it made no sense that this weird error was raised. I finally found that the CUDA version in the conda virtual environment was CUDA 9.2, which isn't compatible with nvidia-390. I solved the issue with the following steps on Ubuntu 18.04:
check the driver version
~$ nvidia-smi
or ~$ cat /proc/driver/nvidia/version
check the local CUDA version
~$ nvcc --version
or ~$ cat /usr/local/cuda/version.txt
check the local cuDNN version
~$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
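Note: on cuDNN 8 and newer the version macros moved to a separate header, so if the grep above prints nothing, this variant (assuming the same default install path) should work instead:
~$ cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2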
check the CUDA version in the virtual environment
~$ conda list
you will see something like this:
cudatoolkit 9.2 0
cudnn 7.3.1 cuda9.2_0
You may find that the CUDA version in the virtual environment is different from the local CUDA version and isn't compatible with the driver version nvidia-390.
So reinstall CUDA in the virtual environment:
~$ conda install cudatoolkit=8.0
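Once the matching cudatoolkit is in place, a quick way to confirm the fix from inside the activated environment is to ask TensorFlow whether it can see the GPU; this one-liner is a minimal sketch using the TF 1.x API that matches the CUDA 9.x setup described above:
~$ python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"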
My two cents:
the problem may be related to the selected GPU mode (Performance/Power Saving Mode). Performance mode uses the Nvidia GPU, while Power Saving Mode switches to the integrated Intel GPU. When you select Power Saving Mode (integrated Intel GPU) using the nvidia-settings utility (in the "PRIME Profiles" configuration) and then execute the deviceQuery script, you get this error:
-> CUDA driver version is insufficient for CUDA runtime version
But this error is misleading: by selecting Performance Mode (NVIDIA GPU) again with the nvidia-settings utility, the problem disappears.
In my case it was not a driver version problem; I simply needed to re-enable the Nvidia GPU.
Regards
P.S.: The selection is available when the PRIME-related packages are installed (you need the Nvidia proprietary driver). Further details: https://askubuntu.com/questions/858030/nvidia-prime-in-nvidia-x-server-settings-in-16-04-1
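For anyone who has not built it before, the deviceQuery sample mentioned above can be compiled from the toolkit's bundled sources roughly like this (the path assumes a default CUDA toolkit install and may vary with your version):
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery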
During my experiments (Ubuntu 18.04 LTS on a ThinkPad 470s with an NVIDIA GeForce 940MX), what I learned is that Table 2, "CUDA Toolkit and Compatible Driver Versions", from the release notes (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html) is the most important information to keep in mind while installing the CUDA drivers.
Also, you can check whether your NVIDIA driver version and the table are in sync with the command
$ nvidia-smi
The output will look something like this (for a recent driver):
NVIDIA-SMI 450.66 Driver Version: 450.66 CUDA Version: 11.0
Once everything is in place, you can verify it by copying the cuDNN samples and running them as follows:
$ cp -r /usr/src/cudnn_samples_v8/ .
$ cd cudnn_samples_v8/
$ cd mnistCUDNN/
$ make clean && make
$ ./mnistCUDNN
You'll get results like this:
Executing: mnistCUDNN
cudnnGetVersion() : 8003 , CUDNN_VERSION from cudnn.h : 8003 (8.0.3)
Host compiler version : GCC 9.3.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 3  Capabilities 5.0, SmClock 1189.0 Mhz, MemSize (Mb) 2004, MemClock 2505.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
..........
Resulting weights from Softmax: 0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax: 0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
I had to try at least 5-6 times before I realized the correlation between NVIDIA driver and CUDA versions, and then it all worked on the next attempt. A very happy ending nonetheless.
With reference to the answer by Fabiano Tarlao: if you have already installed the required Nvidia driver, you can select it from the Linux command line using:
sudo prime-select nvidia
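To see which profile is currently active before or after switching, the same tool can be queried (a supplementary check, not part of the original answer):
prime-select query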
Running
sudo apt-get purge nvidia-*
and reinstalling the drivers using
sudo apt-get install nvidia-375
solved it. Just for the record, the first time I had updated the drivers using the GUI (the Additional Drivers tab in Software & Updates).
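Put together, the sequence looks roughly like this; the reboot and the final nvidia-smi check are added here for verification and were not spelled out in the original answer:
sudo apt-get purge nvidia-*
sudo apt-get update
sudo apt-get install nvidia-375
sudo reboot
# after rebooting, confirm the driver is loaded
nvidia-smi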