After building TensorFlow from source, seeing libcudart.so and libcudnn errors

Anonymous (unverified), submitted 2019-12-03 08:28:06

Question:

I'm building TensorFlow from source code. The build appears to succeed; however, when my TensorFlow program invokes import tensorflow, one or both of the following errors appear:

  • ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
  • ImportError: libcudnn.5: cannot open shared object file: No such file or directory

Answer 1:

First, for the following error:

ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

make sure your LD_LIBRARY_PATH includes the lib64 directory of whichever path you installed your CUDA package in. You can do this by adding an export line to your .bashrc. For Omar, it looked like the following:

I fixed this just adding the cuda path to my .bashrc

export LD_LIBRARY_PATH=/usr/local/cuda/lib64/


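For me, I had to add Omar's line and also export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/ because I have two directories involving CUDA (probably not ideal). Note that each plain assignment replaces the previous value, so to keep both directories in effect, join them with a colon: export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-8.0/lib64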
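To keep several CUDA directories on the search path at once, each one can be added to the existing value rather than assigned over it. A minimal sketch for a .bashrc, assuming the two paths mentioned above; the helper name add_lib_dir is hypothetical and not part of any CUDA or TensorFlow tooling.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH only if it is
# not already listed, so sourcing .bashrc repeatedly does not duplicate it.
add_lib_dir() {
    _dir=${1%/}                     # drop one trailing slash before comparing
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$_dir:"*) ;;             # already present, nothing to do
        *) LD_LIBRARY_PATH="$_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Paths taken from the answer above; adjust to your installation.
add_lib_dir /usr/local/cuda/lib64
add_lib_dir /usr/local/cuda-8.0/lib64
```

Prepending means the directory added last is searched first.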


Second, are you sure you installed cuDNN? Note that this is different from the regular cuda package. You will need to register, then download and install the package from the following page: https://developer.nvidia.com/cudnn


Third, I had this same problem:

ImportError: libcudnn.5: cannot open shared object file: No such file or directory

It turns out there is no libcudnn.5 in my /usr/local/cuda/lib64 or /usr/local/cuda-8.0/lib64 directories. However, I do have a libcudnn.so.6.* file. To solve the problem, I created a soft link:

ln -s libcudnn.so.6.* libcudnn.so.5

in my /usr/local/cuda/lib64 directory. Now everything works for me. Your directory might be different if you already had cuDNN, and your libcudnn.so.6.* might be a different version, so check that.
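Before creating a link, it can help to see which libcudnn files and links actually exist. A minimal sketch, assuming the library lives in /usr/local/cuda/lib64 as in this answer; show_cudnn_libs is a hypothetical helper name.

```shell
# Hypothetical helper: list every libcudnn.so* entry in a directory and,
# for symbolic links, show the target each one points to.
show_cudnn_libs() {
    for f in "$1"/libcudnn.so*; do
        # Skip the unexpanded pattern when the glob matched nothing.
        [ -e "$f" ] || [ -L "$f" ] || continue
        if [ -L "$f" ]; then
            echo "$(basename "$f") -> $(readlink "$f")"
        else
            basename "$f"
        fi
    done
}

# Directory taken from the answer above; adjust to your installation.
show_cudnn_libs /usr/local/cuda/lib64
```

If the name in the ImportError does not appear in this listing, that is the name the loader is failing to find.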



Answer 2:

I came across the same issue:

In [1]: import tensorflow
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     40     sys.setdlopenflags(_default_dlopen_flags | ctypes.RTLD_GLOBAL)
---> 41   from tensorflow.python.pywrap_tensorflow_internal import *
     42   from tensorflow.python.pywrap_tensorflow_internal import __version__

/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in <module>()
     27             return _mod
---> 28     _pywrap_tensorflow_internal = swig_import_helper()
     29     del swig_import_helper

/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
     23             try:
---> 24                 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
     25             finally:

/usr/local/lib/python3.5/imp.py in load_module(name, file, filename, details)
    241         else:
--> 242             return load_dynamic(name, filename, file)
    243     elif type_ == PKG_DIRECTORY:

/usr/local/lib/python3.5/imp.py in load_dynamic(name, path, file)
    341             name=name, loader=loader, origin=path)
--> 342         return _load(spec)
    343

ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
in <module>()
----> 1 import tensorflow

/usr/local/lib/python3.5/site-packages/tensorflow/__init__.py in <module>()
     22
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26

/usr/local/lib/python3.5/site-packages/tensorflow/python/__init__.py in <module>()
     49 import numpy as np
     50
---> 51 from tensorflow.python import pywrap_tensorflow
     52
     53 # Protocol buffers

/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     50 for some common reasons and solutions.  Include the entire stack trace
     51 above this error message when asking for help.""" % traceback.format_exc()
---> 52   raise ImportError(msg)
     53
     54 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/local/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/local/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

I had installed cuDNN 6.0, but TensorFlow was looking for libcudnn.so.5, which it could not find. If your TensorFlow build reports the same missing file, it needs cuDNN 5.x, so install cuDNN 5.x.

Make sure you have already installed CUDA 8.0 and exported PATH and LD_LIBRARY_PATH.

To install cuDNN 5.x, try the following commands.

Extract the tgz file
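One way to see exactly which shared libraries the loader cannot find is to run ldd on TensorFlow's native extension. A minimal sketch, assuming a Linux system with ldd available; the helper names are hypothetical, and the .so path is an assumption derived from the site-packages directory in the traceback above, so adjust it to your installation.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries that ldd reports as "not found".
unresolved_from_ldd() {
    awk '/not found/ { print $1 }'
}

# Hypothetical helper: list unresolved libraries for one binary or .so file.
missing_libs() {
    ldd "$1" 2>/dev/null | unresolved_from_ldd
}

# Assumed location of the native module; the directory comes from the
# traceback, the file name is an assumption.
so=/usr/local/lib/python3.5/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so
if [ -f "$so" ]; then
    missing_libs "$so"
fi
```

Each name printed is a library that must be installed or made reachable through LD_LIBRARY_PATH before the import can succeed.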

$ tar -zxvf cudnn-8.0-linux-x64-v5.1.tgz

Check the files

$ cd cuda/lib64/
$ ls -l
total 150908
lrwxrwxrwx 1 doom doom       13 Nov  7  2016 libcudnn.so -> libcudnn.so.5
lrwxrwxrwx 1 doom doom       18 Nov  7  2016 libcudnn.so.5 -> libcudnn.so.5.1.10
-rwxr-xr-x 1 doom doom 84163560 Nov  7  2016 libcudnn.so.5.1.10
-rw-r--r-- 1 doom doom 70364814 Nov  7  2016 libcudnn_static.a

Here you will see two symbolic links. Copy only the two regular files, libcudnn.so.5.1.10 and libcudnn_static.a, to /usr/local/cuda/lib64; the links are recreated in the next step.

Make symbolic link files

$ cd /usr/local/cuda/lib64/
$ sudo ln -s libcudnn.so.5.1.10 libcudnn.so.5
$ sudo ln -s libcudnn.so.5 libcudnn.so
$ ls -l libcudnn*
lrwxrwxrwx 1 root root       13 May 24 09:24 libcudnn.so -> libcudnn.so.5
lrwxrwxrwx 1 root root       18 May 24 09:24 libcudnn.so.5 -> libcudnn.so.5.1.10
-rwxr-xr-x 1 root root 84163560 May 24 09:23 libcudnn.so.5.1.10
-rw-r--r-- 1 root root 70364814 May 24 09:23 libcudnn_static.a

Copy cudnn.h from the extracted cuda/include directory to /usr/local/cuda/include (run this from inside that include directory)

$ sudo cp cudnn.h /usr/local/cuda/include/
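After copying the header, its version can be compared with the libcudnn.so.N name in the error. A minimal sketch, assuming cudnn.h defines the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros (as the cuDNN 5.x and 6.x headers do); cudnn_header_version is a hypothetical helper name.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL as defined in a cudnn.h.
cudnn_header_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Header location used in this answer; adjust to your installation.
if [ -f /usr/local/cuda/include/cudnn.h ]; then
    cudnn_header_version /usr/local/cuda/include/cudnn.h
fi
```

The major number printed here should match the number at the end of the libcudnn.so name that TensorFlow asks for.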

Hope it will help you!



Answer 3:

I fixed this just adding the cuda path to my .bashrc

export LD_LIBRARY_PATH=/usr/local/cuda/lib64/

Keep in mind that you first need to go to NVIDIA's Deep Learning page, register, download cuDNN, extract it, and copy the files from its include and lib64 folders into your CUDA installation.



Answer 4:

I have seen a similar error (bottom of this post), but complaining about libcudnn.so.6 instead of libcudart.so.8.0 (see a note below).

Solution:

  1. Download 'cuDNN v6.0 Library for Linux':
  2. Follow the instructions of Alexander Yau above to install the cuDNN v6.0 library.


Note:

The TensorFlow installation instructions (as of 20/Aug/2017) require installing cuDNN v5.1, but my TensorFlow installation (following the instructions for installing in a virtualenv) required cuDNN v6.x (as indicated by the error). I don't know whether the mistake is on my side or in the TensorFlow documentation. Nevertheless, the above solution worked for me.


Encountered error:

In [1]: import tensorflow as tf
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
in <module>()
----> 1 import tensorflow as tf

/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/__init__.py in <module>()
     22
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26

/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/python/__init__.py in <module>()
     47 import numpy as np
     48
---> 49 from tensorflow.python import pywrap_tensorflow
     50
     51 # Protocol buffers

/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     50 for some common reasons and solutions.  Include the entire stack trace
     51 above this error message when asking for help.""" % traceback.format_exc()
---> 52   raise ImportError(msg)
     53
     54 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/haseeb/.virtualenvs/attention_transformer/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.


Answer 5:

First, install the CUDA library (version 7.5) from here

Installation Instructions:

  1. sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
  2. sudo apt-get update
  3. sudo apt-get install cuda

Second, install the cuDNN from here

Third, export cuDNN path:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64/

If you get an error like "The package libcudnnX needs to be reinstalled", follow the steps here



Answer 6:

The preceding errors are typically caused by not specifying a version number for the Cuda SDK or cuDNN when you run the configure script. In other words, when running the configure script, always specify a version number in response to the following two questions:

  • Please specify the Cuda SDK version you want to use, e.g. 7.0.
  • Please specify the cuDNN version you want to use.

Don't accept the system defaults.
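For non-interactive builds, the same answers can be supplied before running configure. The sketch below assumes the TensorFlow 1.x configure script reads these environment variables; the variable names, the install paths and the example versions (8.0 and 5) are assumptions to check against the configure script in your own checkout.

```shell
# Assumed environment variables consulted by TensorFlow 1.x's configure
# script; verify each name in your source tree before relying on it.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=8.0          # matches libcudart.so.8.0
export TF_CUDNN_VERSION=5           # matches libcudnn.so.5
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export CUDNN_INSTALL_PATH=/usr/local/cuda

# Then run ./configure from the root of the TensorFlow source tree.
```

Pinning the versions explicitly means the built library looks for libcudart.so.8.0 and libcudnn.so.5 rather than whatever the default resolves to.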



Answer 7:

On macOS, this issue is often caused by bazel running in a sandbox environment and therefore not respecting the LD_LIBRARY_PATH set in your local shell. I won't go into the merits of integrating sandboxing so deeply into a build tool.

The simple workaround is to symlink the libraries into /usr/local/lib.

cd /usr/local/lib && ln -s ../cuda/lib/libcudart.8.0.dylib



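Answer 8:

Mysteriously, my libcudnn.so.5 was installed at ~/cuda/lib64. The shell does not expand a tilde inside double quotes, so the loader receives a literal ~ that it cannot resolve. For people like me, you need to change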

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:~/cuda/lib64"

to

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/yourusername/cuda/lib64"
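The difference is easy to confirm in a shell. A minimal sketch, assuming the library sits under cuda/lib64 in your home directory as in this answer; using $HOME avoids hard-coding the user name, and the variable names quoted and expanded are only for illustration.

```shell
quoted="~/cuda/lib64"         # tilde inside double quotes stays a literal "~"
expanded="$HOME/cuda/lib64"   # $HOME is expanded inside double quotes

echo "$quoted"
echo "$expanded"

# Append the expanded path; the ${VAR:+...} form avoids a leading colon
# when LD_LIBRARY_PATH was previously empty or unset.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$HOME/cuda/lib64"
```

The first echo prints the path with the tilde still in it, which is what the dynamic loader would have seen.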


Answer 9:

TensorFlow 1.2.1 is compatible with cuDNN 5.1, but not yet with 6.0. So just install cuDNN 5.1. Besides that you seem to be missing CUDA 8.0.



Answer 10:

Check the NVIDIA requirements to run TensorFlow with GPU support:

  • The NVIDIA drivers associated with CUDA Toolkit 8.0

  • cuDNN v6.0

  • GPU card with CUDA Compute Capability 3.0 or higher

  • The libcupti-dev library, which is the NVIDIA CUDA Profile Tools Interface

I installed cuda v5.1, but the message below still appeared:

ImportError: libcudart.so.8.0: cannot open shared object file:   No such file or directory

I was frustrated because everything looked fine, so I decided to check my GPU with the following command (on Linux):

glxinfo | grep GeForce

And I noticed that my NVIDIA GPU is not supported:

OpenGL renderer string: GeForce GTX 560M/PCIe/SSE2

In this link you can find a list of supported GPUs.

So my solution was to use TensorFlow without GPU support. First I uninstalled the GPU package:

pip uninstall tensorflow-gpu

Then I installed the version without GPU support:

pip install tensorflow

