cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

谁说胖子不能爱 posted on 2020-07-04 13:21:08

Question


I get the following error when I run TensorFlow on a GPU.

2018-09-15 18:56:51.011724: E tensorflow/core/common_runtime/direct_session.cc:158] Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "evaluate_sample.py", line 160, in <module>
    tf.app.run(main)
  File "/anaconda3/envs/tf/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "evaluate_sample.py", line 123, in main
    with tf.Session() as sess:
  File "/anaconda3/envs/tf/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1494, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/anaconda3/envs/tf/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 626, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Where do the following errors come from?

E tensorflow/core/common_runtime/direct_session.cc:158] Internal: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

and tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

My versions are:

TensorFlow: 1.10

cat /proc/driver/nvidia/version

NVRM version: NVIDIA UNIX x86_64 Kernel Module 390.77 Tue Jul 10 18:28:52 PDT 2018
GCC version: gcc version 7.3.0 (Debian 7.3.0-28)

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44


Answer 1:


The reason for this error is a mismatch between your installed Cuda Toolkit version and the version of the Python package cudatoolkit, which is usually installed as a dependency of tensorflow-gpu.

To fix this, first match your TensorFlow version with your installed Cuda Toolkit version, as shown here.

Then check the version of your cudatoolkit package. It has to match in both major and minor version. For example, if you have Cuda Toolkit 9.0 installed and the cudatoolkit package is at 9.1, you need to downgrade cudatoolkit to 9.0 through your Python package manager.
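The major/minor comparison described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the version strings are typed in by hand rather than read from the system or from the package manager.

```python
def major_minor(version):
    """Return (major, minor) from a version string such as '9.0' or '9.1.85'."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def same_cuda_release(system_toolkit, python_cudatoolkit):
    """True when both version strings share the same major and minor number."""
    return major_minor(system_toolkit) == major_minor(python_cudatoolkit)


# Hypothetical inputs: the 9.0 / 9.1 mismatch used as the example above.
print(same_cuda_release("9.0", "9.1"))      # False
print(same_cuda_release("9.0", "9.0.176"))  # True
```

The patch level (the third number) is deliberately ignored, since the answer only asks for major and minor to agree.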




Answer 2:


In the case I just solved, the fix was updating the GPU driver to the latest version and installing the CUDA toolkit. Your error is telling you that your CUDA driver version is too old. I believe the nvcc version we were seeing was 7.5; your output shows nvcc release 8.0 (the 7.3.0 in your output is the GCC version).

I think all you will have to do is run sudo apt install nvidia-cuda-toolkit and then reboot.

Below are the steps I took for the problem where the libcuda.so.1 file could not be found.

First, the ppa was added and a newer GPU driver installed:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390

After adding the ppa, it showed options for driver versions, and 390 was the latest 'stable' version that was shown.

Then install the cuda toolkit:

sudo apt install nvidia-cuda-toolkit

Then reboot:

sudo reboot

Installing the toolkit updated the driver to a newer version than the 390 installed in the first step (it was 410; this was a p2.xlarge instance on AWS).
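When comparing driver and toolkit versions it is easy to misread the command output, so a small parser can help. The sketch below is an assumption-laden illustration: the function names are invented here, the regular expressions are only matched against the two output lines quoted in the question, and other driver or toolkit builds may format their output differently.

```python
import re


def parse_driver_version(text):
    """Extract (major, minor) of the driver from /proc/driver/nvidia/version text."""
    match = re.search(r"Kernel Module\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_nvcc_release(text):
    """Extract (major, minor) of the toolkit from `nvcc --version` text."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample lines copied from the question.
DRIVER_TEXT = "NVRM version: NVIDIA UNIX x86_64 Kernel Module 390.77 Tue Jul 10 18:28:52 PDT 2018"
NVCC_TEXT = "Cuda compilation tools, release 8.0, V8.0.44"

print(parse_driver_version(DRIVER_TEXT))  # (390, 77)
print(parse_nvcc_release(NVCC_TEXT))      # (8, 0)
```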




Answer 3:


Updating the NVIDIA driver solved this issue.

You can check your CUDA toolkit compatibility here. Then update your NVIDIA driver by downloading it from here.
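The compatibility check amounts to a table lookup, which might look like the following sketch. The minimum driver numbers are an assumption: they were copied by hand from the Linux x86_64 column of NVIDIA's CUDA release notes and should be verified against the official table before relying on them. The names MIN_LINUX_DRIVER and driver_is_sufficient are made up for this example.

```python
# Minimum Linux x86_64 driver per CUDA toolkit release.
# ASSUMPTION: values transcribed by hand from NVIDIA's CUDA release notes;
# verify them against the official compatibility table.
MIN_LINUX_DRIVER = {
    (8, 0): (367, 48),
    (9, 0): (384, 81),
    (9, 1): (390, 46),
    (9, 2): (396, 26),
    (10, 0): (410, 48),
    (10, 1): (418, 39),
}


def driver_is_sufficient(driver, cuda_release):
    """Return True or False, or None when the release is not in the table."""
    minimum = MIN_LINUX_DRIVER.get(cuda_release)
    if minimum is None:
        return None
    # Tuples compare element by element: major first, then minor.
    return driver >= minimum


# Driver 390.77 is the one reported in the question.
print(driver_is_sufficient((390, 77), (9, 0)))  # True
print(driver_is_sufficient((390, 77), (9, 2)))  # False
```

If the table is right, a 390.77 driver would be too old for a CUDA 9.2 runtime, which is the kind of situation the error message describes.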




Answer 4:


Just update your NVIDIA drivers and it will solve the issue.




Answer 5:


Same problem. I solved it by updating the NVIDIA driver, because I was using TensorFlow 2.1, which requires a newer driver. I was on 390 and updated to 435 through Ubuntu's software manager.




Answer 6:


For Ubuntu 18.04 and TensorFlow 1.13.1:

First make sure the system is up to date:

sudo apt update
sudo apt dist-upgrade
sudo reboot now

Install newer drivers:

sudo add-apt-repository ppa:graphics-drivers/ppa

Open Software & Updates and select the Additional Drivers tab:

Select nvidia-driver-396 and click Apply Changes.

Now reboot:

sudo reboot now

To verify that NVIDIA driver 396 is active:

nvidia-smi
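The same verification can be scripted from Python, for example at the top of a training script. This is a minimal sketch, not part of the original steps: the function names are made up, and it assumes nvidia-smi is on the PATH and supports the --query-gpu=driver_version and --format=csv,noheader options. It returns None rather than failing when the tool is missing.

```python
import shutil
import subprocess


def installed_driver_version():
    """Return the driver version string reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # With several GPUs there is one line per GPU; the driver is the same for all.
    return lines[0].strip() if lines else None


def driver_major_is(version, expected_major):
    """True when the version string starts with the expected major number."""
    return version is not None and version.split(".")[0] == str(expected_major)


version = installed_driver_version()
print("driver:", version, "| is 396:", driver_major_is(version, 396))
```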


Source: https://stackoverflow.com/questions/52346957/cudagetdevice-failed-status-cuda-driver-version-is-insufficient-for-cuda-run
