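Using TensorFlow-GPU and installing CUDA and cuDNN on Ubuntu

To speed up TensorFlow computation, we use the GPU version of TensorFlow, which requires CUDA and cuDNN. This article uses Ubuntu as the example.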

Requirements
The TensorFlow Python API supports Python 2.7 and Python 3.3+.
The GPU version works best with Cuda Toolkit 8.0 and cuDNN v5.1. Other versions are supported (Cuda toolkit >= 7.0 and cuDNN >= v3) only when installing from sources. Please see Cuda installation for details. For Mac OS X, please see Setup GPU for Mac.
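The Python requirement quoted above can be checked before going further. The following is a minimal sketch; the function name python_supported is hypothetical and only encodes the "Python 2.7 and Python 3.3+" rule stated in the requirements.

```python
import sys


def python_supported(version_info=None):
    """Return True for Python 2.7 or Python 3.3+ (the rule quoted above)."""
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return (major, minor) >= (3, 3)


if __name__ == "__main__":
    # Report whether the running interpreter satisfies the requirement.
    print("Python %d.%d supported: %s"
          % (sys.version_info[0], sys.version_info[1], python_supported()))
```

Passing an explicit tuple such as (3, 5) lets you check a version other than the one currently running.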

Local environment

OS: Linux Mint 18.1 Serena (based on Ubuntu 16.04)
CPU: Intel® Core™ i5-3210M CPU @ 2.50GHz
GPU: GeForce GT 635M

CUDA installation steps

Install the graphics driver

Open System Settings --> Driver Manager and select a suitable driver.
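The CUDA 8.0 installer output shown later in this guide states that a driver of version at least 361.00 is required, so it is worth confirming the driver version once it is installed. The sketch below assumes the driver exposes its version in /proc/driver/nvidia/version with a line of the form "NVRM version: NVIDIA UNIX x86_64 Kernel Module  <version>  <date>"; the helper names are hypothetical, and the exact file layout may differ between driver releases.

```python
import re

DRIVER_INFO = "/proc/driver/nvidia/version"  # assumed location of the driver info


def parse_driver_version(text):
    """Extract a version string such as '367.57' from the driver info text."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def driver_at_least(version, minimum="361.00"):
    """Compare dotted version strings numerically, component by component."""
    def to_tuple(value):
        return tuple(int(part) for part in value.split("."))
    return to_tuple(version) >= to_tuple(minimum)


if __name__ == "__main__":
    try:
        with open(DRIVER_INFO) as handle:
            version = parse_driver_version(handle.read())
    except IOError:
        version = None
    if version is None:
        print("No NVIDIA driver information found")
    else:
        print("Driver %s, new enough for CUDA 8.0: %s"
              % (version, driver_at_least(version)))
```

Running nvidia-smi in a terminal reports the same driver version if you prefer not to use a script.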

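Download CUDA

Download the CUDA installer (this guide uses the runfile cuda_8.0.44_linux.run).

Run the CUDA installer

Change to the directory containing the downloaded file and run the following in a terminal:
sudo sh cuda_8.0.44_linux.run
Follow the command-line prompts.
The installer asks whether to install the graphics driver. Since we already installed it in the first step, answer no: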

Do you accept the previously read EULA? (accept/decline/quit): accept  
You are attempting to install on an unsupported configuration. Do you wish to continue? ((y)es/(n)o) [ default is no ]: y  
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 352.39? ((y)es/(n)o/(q)uit): n  
Install the CUDA 8.0 Toolkit? ((y)es/(n)o/(q)uit): y  
Enter Toolkit Location [ default is /usr/local/cuda-8.0 ]:  
Do you want to install a symbolic link at /usr/local/cuda? ((y)es/(n)o/(q)uit): y  
Install the CUDA 8.0 Samples? ((y)es/(n)o/(q)uit): y  
Enter CUDA Samples Location [ default is /home/kyle ]:

Wait for the installation to finish.
When it completes, you may see warnings that the samples are missing some recommended libraries:

Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so

Installing the CUDA Samples in /home/kyle ...
Copying samples to /home/kyle/NVIDIA_CUDA-8.0_Samples now...
Finished copying samples.

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-8.0
Samples:  Installed in /home/kyle, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-8.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 361.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_9426.log

These missing libraries can be ignored. They only affect some of the samples, and everything else works without them.
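If you later want to build the graphical samples, you can check which of the libraries named in the warning are visible on the system. The sketch below uses ctypes.util.find_library from the Python standard library. It is only an approximation: it asks the system loader for the runtime libraries, which is not necessarily the same check the CUDA installer performs. The function name missing_libraries is hypothetical.

```python
from ctypes.util import find_library

# Library names taken from the installer warning (libGLU.so, libX11.so, ...).
RECOMMENDED = ["GLU", "X11", "Xi", "Xmu", "GL"]


def missing_libraries(names=RECOMMENDED, finder=find_library):
    """Return the names for which the finder cannot locate a library."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Not found: " + ", ".join("lib%s" % name for name in missing))
    else:
        print("All recommended libraries were found")
```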

Configure environment variables

Open a shell, run gedit ~/.bashrc, and add the following lines:

# add cuda
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH

To apply the changes immediately, run source ~/.bashrc
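To confirm that the two variables took effect in the current shell, a small check along these lines may help. It is a sketch that assumes the default install locations used in this guide; the helper names path_contains and check_cuda_env are hypothetical.

```python
import os

CUDA_BIN = "/usr/local/cuda-8.0/bin"
CUDA_LIB = "/usr/local/cuda-8.0/lib64"


def path_contains(value, directory):
    """Return True if directory is one of the entries in a PATH-style string."""
    entries = [os.path.normpath(p) for p in value.split(os.pathsep) if p]
    return os.path.normpath(directory) in entries


def check_cuda_env(environ=os.environ):
    """Report whether PATH and LD_LIBRARY_PATH include the CUDA directories."""
    return {
        "PATH": path_contains(environ.get("PATH", ""), CUDA_BIN),
        "LD_LIBRARY_PATH": path_contains(
            environ.get("LD_LIBRARY_PATH", ""), CUDA_LIB),
    }


if __name__ == "__main__":
    for name, ok in sorted(check_cuda_env().items()):
        print("%s: %s" % (name, "ok" if ok else "CUDA directory missing"))
```

Run it from a new terminal, or after source ~/.bashrc, so that it sees the updated environment.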

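For more on setting Linux environment variables, see:
A detailed guide to setting environment variables in Ubuntu
Methods for setting Linux environment variables and how they differ (Ubuntu)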

Verify the installation

Check the CUDA version

kyle@kyle-Lenovo-M490 ~ $ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44
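The same check can be scripted. The sketch below runs nvcc -V and extracts the release and build version from the line "Cuda compilation tools, release 8.0, V8.0.44" shown above. The function names are hypothetical, and the regular expression assumes the output format in this transcript.

```python
import re
import subprocess


def parse_nvcc_release(output):
    """Return (release, build), e.g. ('8.0', '8.0.44'), or None if not found."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+(?:\.\d+)+)", output)
    return match.groups() if match else None


def nvcc_release():
    """Run `nvcc -V` and parse its output; None if nvcc is not on PATH."""
    try:
        output = subprocess.check_output(["nvcc", "-V"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(output.decode("utf-8", "replace"))


if __name__ == "__main__":
    print(nvcc_release())
```

A result of None usually means nvcc is not on PATH, which points back to the environment variable step.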

Build the CUDA Samples
Change to the samples installation directory.
To save time, we build just one of them, for example:

kyle@kyle-Lenovo-M490 ~ $ cd ~/NVIDIA_CUDA-8.0_Samples/0_Simple/vectorAdd
kyle@kyle-Lenovo-M490 ~/NVIDIA_CUDA-8.0_Samples/0_Simple/vectorAdd $ make
"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o vectorAdd.o -c vectorAdd.cu
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
"/usr/local/cuda-8.0"/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o vectorAdd vectorAdd.o
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp vectorAdd ../../bin/x86_64/linux/release
kyle@kyle-Lenovo-M490 ~/NVIDIA_CUDA-8.0_Samples/0_Simple/vectorAdd $ ./vectorAdd
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

If no errors are reported, the installation is complete.

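cuDNN installation steps

Next we install cuDNN.
Before downloading cuDNN, you need to register an account:

cuDNN is freely available to members of the Accelerated Computing Developer Program

After registering, go to the download page
and choose cuDNN v5.1 Library for Linux.

Installing cuDNN is simple: extract the downloaded archive and copy its contents into the lib64 and include directories of the CUDA installation.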

$ cd ~
$ tar -zxf cudnn-8.0-linux-x64-v5.1.tgz
$ cd cuda
$ sudo cp lib64/* /usr/local/cuda/lib64/
$ sudo cp include/* /usr/local/cuda/include/

Congratulations! cuDNN is now installed.
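To double-check which cuDNN version was copied, you can read the version macros from the installed header. The sketch below assumes the header sits at /usr/local/cuda/include/cudnn.h (the destination of the cp command above) and that it defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the v5.1 header does; other cuDNN releases may keep these macros in a different file. The function name is hypothetical.

```python
import re

CUDNN_HEADER = "/usr/local/cuda/include/cudnn.h"  # assumed install location


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the #define lines, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    try:
        with open(CUDNN_HEADER) as handle:
            print(parse_cudnn_version(handle.read()))
    except IOError:
        print("cudnn.h not found at " + CUDNN_HEADER)
```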

Installation complete

At this point both CUDA and cuDNN are installed.

Install TensorFlow-GPU

Installing TensorFlow-GPU is straightforward with pip:

$ pip install tensorflow-gpu

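If you run into problems, see the TensorFlow download and installation guide.

Test TensorFlow

In a Python session, run import tensorflow and check whether the CUDA libraries load successfully: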

Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul  2 2016, 17:53:06)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
>>>

Congratulations, you're done!
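Beyond the import log, you may also want to confirm that TensorFlow actually sees the GPU. The sketch below is one possible check, assuming a TensorFlow build that provides tensorflow.python.client.device_lib; the function name list_gpu_devices is hypothetical, and the device names it prints vary between TensorFlow versions.

```python
def list_gpu_devices():
    """Return the names of GPU devices TensorFlow can see.

    Returns None when TensorFlow cannot be imported at all.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [device.name
            for device in device_lib.list_local_devices()
            if device.device_type == "GPU"]


if __name__ == "__main__":
    gpus = list_gpu_devices()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif gpus:
        print("GPU devices: " + ", ".join(gpus))
    else:
        print("TensorFlow imported, but no GPU device was found")
```

An empty list with a successful import usually suggests that the CPU-only package is installed or that the CUDA libraries were not found at load time.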

References

How to install CUDA Toolkit and cuDNN for deep learning
Installing NVIDIA CUDA Toolkit 7.5 on Ubuntu 16.04