1 Install Ubuntu 18.04.3 LTS
spt@spt-ts:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
Codename:       bionic
spt@spt-ts:~$ df -ah
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           794M  1.9M  792M   1% /run
/dev/sda6       111G  5.5G  100G   6% /
/dev/sda1       454M  112M  315M  27% /boot
/dev/sdb1       916G  142M  870G   1% /home
# swap is set to 6 GB
I took a desktop machine, wiped the whole disk, and did a fresh install of Ubuntu 18.04.3 LTS.

2 Install the NVIDIA graphics driver
spt@spt-ts:~$ lspci | grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 950] (rev a1)
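GPU: GTX 950. The driver version has to meet the requirement of the CUDA version that will be installed.
(Image omitted: driver and CUDA version compatibility table.)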
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
ubuntu-drivers devices
sudo apt install xserver-xorg-core
sudo ubuntu-drivers autoinstall
This installed the latest graphics driver.
Verify the driver installation:
spt@spt-ts:~$ nvidia-smi
Fri Sep  6 10:50:46 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950     Off  | 00000000:01:00.0  On |                  N/A |
| 32%   41C    P8    10W / 105W |    207MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       974      G   /usr/lib/xorg/Xorg                            13MiB |
|    0      1036      G   /usr/bin/gnome-shell                          48MiB |
|    0      1382      G   /usr/lib/xorg/Xorg                            70MiB |
|    0      1509      G   /usr/bin/gnome-shell                          71MiB |
+-----------------------------------------------------------------------------+
spt@spt-ts:~$
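The "CUDA Version" field in the nvidia-smi header is the highest CUDA version the installed driver supports, not the toolkit that is installed. The following is a minimal sketch, in Python, of pulling both version fields out of that header and checking that a planned toolkit version is not newer than what the driver supports. The function names are made up for this illustration, and the sample string is the header line from the output above.

```python
import re


def parse_nvidia_smi_header(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output, or None."""
    m = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text)
    return (m.group(1), m.group(2)) if m else None


def version_tuple(version):
    """Turn '10.1' into (10, 1) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def toolkit_supported(toolkit_version, driver_cuda_version):
    """True if the toolkit is not newer than the CUDA version the driver reports."""
    return version_tuple(toolkit_version) <= version_tuple(driver_cuda_version)


# Header line copied from the nvidia-smi output above.
sample = "| NVIDIA-SMI 435.21 Driver Version: 435.21 CUDA Version: 10.1 |"
driver, cuda = parse_nvidia_smi_header(sample)
print(driver, cuda, toolkit_supported("10.0", cuda))  # 435.21 10.1 True
```

On the machine itself the same function could be fed the real output, for example via `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`.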
3 Install vim and the SSH service
Neither is needed by the project itself; I mainly want to connect to this machine over SSH.
sudo apt install vim openssh-server
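4 Install CUDA v10.0
First, following the official TensorFlow guide, check version compatibility.
https://tensorflow.google.cn/install/source At the time of writing the latest release is TensorFlow 1.14.0, which corresponds to CUDA 10.0 and cuDNN 7.4.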
1. Download and Run `sudo sh cuda_10.0.130_410.48_linux.run`
2. Download and Run Patch 1 (Released May 10, 2019)
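Also take note of how to uninstall. Testing different projects later will require different versions, so it is quite likely that I will need to uninstall this one and install another.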
.............................................. To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin uninstall_cuda_10.0.pl
After installation:
spt@spt-ts:~$ df -ah
Filesystem      Size  Used Avail Use% Mounted on
sysfs              0     0     0    - /sys
proc               0     0     0    - /proc
udev            3.9G     0  3.9G   0% /dev
devpts             0     0     0    - /dev/pts
tmpfs           794M  2.0M  792M   1% /run
/dev/sda6       111G   12G   94G  11% /
Set the environment variables by appending the following to the end of /etc/profile or ~/.bashrc:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
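The `${VAR:+:${VAR}}` form in the two export lines appends a colon and the old value only when the variable is already set and non-empty. A plain `DIR:$VAR` with an empty variable would leave a trailing colon, that is, an empty path entry, which is treated as the current directory. The sketch below mimics that expansion rule in Python purely to illustrate it; `prepend_path` is a hypothetical helper name, not part of any tool.

```python
def prepend_path(new_dir, current):
    """Mimic NEW_DIR${VAR:+:${VAR}} from the shell.

    The ':' + current suffix is added only when current is set and non-empty;
    None (unset) and "" (empty) both produce just new_dir.
    """
    if current:
        return new_dir + ":" + current
    return new_dir


# Variable already has a value: the new directory goes in front of it.
print(prepend_path("/usr/local/cuda/bin", "/usr/bin:/bin"))
# /usr/local/cuda/bin:/usr/bin:/bin

# Variable is empty (typical for LD_LIBRARY_PATH): no trailing colon is added.
print(prepend_path("/usr/local/cuda/lib64", ""))
# /usr/local/cuda/lib64
```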
5 Install cuDNN v7.4.2
Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0
The version must match the CUDA version installed above.
# Download the following files listed under "Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0":
#cuDNN Library for Linux ---> cudnn-10.0-linux-x64-v7.4.2.24.tgz
#cuDNN Runtime Library for Ubuntu18.04 (Deb)
#cuDNN Developer Library for Ubuntu18.04 (Deb)
#cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb)
Extract and install cuDNN:
spt@spt-ts:~/work/tensorflow$ pwd
/home/spt/work/tensorflow
spt@spt-ts:~/work/tensorflow$ tar xvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
spt@spt-ts:~/work/tensorflow$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
spt@spt-ts:~/work/tensorflow$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
spt@spt-ts:~/work/tensorflow$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
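One way to confirm that the copied header really is cuDNN 7.4.2 is to read the version macros from cudnn.h. The sketch below assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` (cuDNN 7.x does) and that it was copied to /usr/local/cuda/include as in the commands above. The function name and the sample text are illustrative only.

```python
import os
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a macro is missing."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        m = re.search(r"^\s*#define\s+CUDNN_%s\s+(\d+)" % name, header_text, re.MULTILINE)
        if m is None:
            return None
        parts.append(int(m.group(1)))
    return tuple(parts)


# Illustrative excerpt with the three macros as they would appear for v7.4.2.
sample = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
"""
print(parse_cudnn_version(sample))  # (7, 4, 2)

# On the installed machine, read the real header if it is present.
header_path = "/usr/local/cuda/include/cudnn.h"
if os.path.exists(header_path):
    with open(header_path, errors="replace") as f:
        print(parse_cudnn_version(f.read()))
```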
6 Install pip3 and virtualenv
# The system ships with Python 3.6 by default, the latest supported version
sudo apt install python3-pip python3-dev python-virtualenv
7 Install TensorFlow-GPU v1.14.0
spt@spt-ts:~/work/tensorflow$ pwd
/home/spt/work/tensorflow
spt@spt-ts:~/work/tensorflow$ mkdir tsenv
spt@spt-ts:~/work/tensorflow$ virtualenv -p python3 tsenv
spt@spt-ts:~/work/tensorflow$ cd tsenv/
spt@spt-ts:~/work/tensorflow/tsenv$ source bin/activate
# Download tensorflow-gpu from the Aliyun (Alibaba) mirror in China
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv$ pip3 install --index-url http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com --upgrade tensorflow-gpu
# Or use the Douban mirror
pip3 install --index-url http://pypi.douban.com/simple --trusted-host pypi.douban.com --upgrade tensorflow-gpu
# Check the installation
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv$ pip3 show tensorflow-gpu
Name: tensorflow-gpu
Version: 1.14.0
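`pip3 show` only confirms that the package is present. A quick way to see whether TensorFlow can actually load the CUDA and cuDNN libraries is to ask it for a GPU from inside the virtualenv. This is a minimal sketch, not part of the original steps: it assumes `tf.test.is_gpu_available()`, which exists in TensorFlow 1.14 and is marked deprecated in later 2.x releases. The wrapper function name is made up for this example.

```python
import importlib.util


def tensorflow_gpu_status():
    """Describe whether TensorFlow is importable and whether it can use a GPU."""
    if importlib.util.find_spec("tensorflow") is None:
        return "tensorflow is not installed in this environment"
    import tensorflow as tf

    # Returns True only if TensorFlow was built with CUDA and a usable GPU is found.
    available = tf.test.is_gpu_available()
    return "TensorFlow %s, GPU available: %s" % (tf.__version__, available)


print(tensorflow_gpu_status())
```

If this reports that no GPU is available even though nvidia-smi works, the usual suspects are the CUDA or cuDNN versions and the `LD_LIBRARY_PATH` setting from the earlier steps.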
# Test
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv/src$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ sudo make
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 803
-> system has unsupported display driver / cuda driver combination
Result = FAIL
# Conclusion: the machine has to be rebooted after installing the driver and CUDA. After rebooting and starting the desktop environment, test again:
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 950"
CUDA Driver Version / Runtime Version 10.1 / 10.0
CUDA Capability Major/Minor version number: 5.2
Total amount of global memory: 2001 MBytes (2098069504 bytes)
( 6) Multiprocessors, (128) CUDA Cores/MP: 768 CUDA Cores
GPU Max Clock rate: 1304 MHz (1.30 GHz)
Memory Clock rate: 3305 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 1048576 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
8 The environment setup is now complete
Further tests are still to be done.